Multi-Robot Guided Policy Search for Learning Decentralized Swarm Control

Chao Jiang, Yi Guo

Research output: Contribution to journal > Article > peer-review

5 Scopus citations

Abstract

Multi-robot learning has been extensively studied recently. Developing provably correct algorithms for learning decentralized control policies remains challenging. In this letter, we propose a sample-efficient multi-robot learning method based on guided policy search to learn decentralized swarm control policies. The proposed method uses distributed trajectory optimization to provide guiding trajectory samples for policy training. In turn, the learned policy is used to update the trajectory optimization results so that the guiding trajectories are reproducible by the current policy. A learning algorithm is designed to alternate between distributed trajectory optimization and policy optimization, and it eventually converges to a solution with good long-term performance. We demonstrate the effectiveness of our method in a multi-robot rendezvous problem. Simulation results in a robot simulator show that our method efficiently learns decentralized control policies with substantially fewer training samples.
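The alternation described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes single-integrator robots, a scalar linear policy gain shared by all robots, and a one-step local quadratic cost with a closed-form minimizer standing in for distributed trajectory optimization. The constants `DT`, `R`, `RHO` and all function names are hypothetical choices made for illustration.

```python
import numpy as np

# Assumed constants: time step, control penalty, policy-agreement weight.
DT, R, RHO = 0.1, 0.01, 0.02

def local_obs(x, neighbors):
    # Each robot observes only the mean relative position of its neighbors.
    return np.stack([x[nbrs].mean(axis=0) - x[i] for i, nbrs in enumerate(neighbors)])

def guiding_action(o, k_policy):
    # Stand-in for local trajectory optimization: closed-form minimizer of
    # ||DT*u - o||^2 + R*||u||^2 + RHO*||u - k_policy*o||^2.
    # The RHO term keeps the guiding action close to the current policy.
    return (DT + RHO * k_policy) / (DT**2 + R + RHO) * o

def train(neighbors, n_iters=20, steps=30, seed=0):
    rng = np.random.default_rng(seed)
    k_policy = 0.0  # scalar gain of the decentralized policy u = k_policy * o
    for _ in range(n_iters):
        # Step 1: roll out the guiding controller and collect samples.
        x = rng.uniform(-1.0, 1.0, size=(len(neighbors), 2))
        obs, acts = [], []
        for _ in range(steps):
            o = local_obs(x, neighbors)
            u = guiding_action(o, k_policy)
            obs.append(o)
            acts.append(u)
            x = x + DT * u
        # Step 2: supervised policy update (least-squares fit of the gain).
        O, U = np.concatenate(obs), np.concatenate(acts)
        k_policy = float((O * U).sum() / (O * O).sum())
    return k_policy

def run_policy(k_policy, x0, neighbors, steps=30):
    # Execute the learned policy using local observations only.
    x = x0.copy()
    for _ in range(steps):
        x = x + DT * k_policy * local_obs(x, neighbors)
    return x

ring = [[3, 1], [0, 2], [1, 3], [2, 0]]  # four robots on a ring graph
k = train(ring)
x0 = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
xT = run_policy(k, x0, ring)
print("learned gain:", k, "final spread:", np.ptp(xT, axis=0).max())
```

In this toy setting the gain iterates toward the fixed point `DT / (DT**2 + R)`, at which the guiding controller and the policy agree. That mirrors, in a very simplified form, the requirement that the guiding trajectories be reproducible by the current policy.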

Original language: English
Article number: 9127548
Pages (from-to): 743-748
Number of pages: 6
Journal: IEEE Control Systems Letters
Volume: 5
Issue number: 3
State: Published - Jul 2021

Keywords

  • Multi-robot learning
  • distributed trajectory optimization
  • guided policy search
  • robotic swarm
