TY - GEN
T1 - Combat robot strategy adaptation using multiple learning agents
AU - Recchia, Thomas
AU - Chung, Jae
AU - Pochiraju, Kishore
PY - 2012
Y1 - 2012
N2 - As robotic systems become more prevalent, it is increasingly desirable for them to be able to operate in highly dynamic environments. A common approach is to use reinforcement learning to allow an agent controlling the robot to learn and adapt its behavior based on a reward function. This paper presents a novel multi-agent system that cooperates to control a single robot battle tank in a melee battle scenario, with no prior knowledge of its opponents' strategies. The agents learn through reinforcement learning and are loosely coupled by their reward functions. Each agent controls a different aspect of the robot's behavior. In addition, the problem of delayed reward is addressed through a time-averaged reward applied to several sequential actions at once. This system was evaluated in a simulated melee combat scenario and was shown to improve its performance over time. This was accomplished by each agent learning to pick specific battle strategies for each different opponent it faced.
UR - http://www.scopus.com/inward/record.url?scp=84887313320&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84887313320&partnerID=8YFLogxK
U2 - 10.1115/IMECE2012-87521
DO - 10.1115/IMECE2012-87521
M3 - Conference contribution
AN - SCOPUS:84887313320
SN - 9780791845202
T3 - ASME International Mechanical Engineering Congress and Exposition, Proceedings (IMECE)
SP - 305
EP - 313
BT - Dynamics, Control and Uncertainty
T2 - ASME 2012 International Mechanical Engineering Congress and Exposition, IMECE 2012
Y2 - 9 November 2012 through 15 November 2012
ER -