Agent Learning as a Control Problem
David Cooper, Advanced Technology Labs, Lockheed Martin
Software solutions for Net-Centric Logistics and other defense applications need to behave predictably while remaining flexible in a dynamic environment. Planning for all contingencies guarantees system stability, but that method is infeasible for most real-world applications. Contingencies can range from a supplier not having enough supplies, to a supplier having too many locations to supply in too short a time, to a supplier being unable to reach a supply point. Adaptive systems offer an approach to dealing with these unpredictable contingencies, and machine learning algorithms are effective at adapting system behavior. Learning technologies such as Reinforcement Learning and Logistic Regression can be constrained to guarantee convergence, but when a system learns multiple tasks, the combined learning may cause undesired behavior. An adaptive system therefore needs a way to determine what kind of adaptation is needed, to use learning to perform that adaptation, and to address the stability of the system as it adapts.
Engineers at Lockheed Martin Advanced Technology Laboratories have developed an adaptive agent architecture based on theories of cognitive development. This architecture provides a control structure that intelligently modifies a wide variety of behaviors in order to maintain and improve performance in dynamic environments. These modifications are accomplished with learning techniques, including Genetic Programming and Reinforcement Learning. Our approach is based on Jean Piaget's adaptation mechanisms from his Cognitive Stage Theory of Development. This paper outlines the concepts behind Piaget's adaptation mechanisms, how these concepts are being applied to create a cognitive, adaptive agent architecture, and ways that techniques from control theory may help to keep the system stable without over-constraining its potential for learning.