TY - GEN
T1 - Zero-Shot Reinforcement Learning on Graphs for Autonomous Exploration under Uncertainty
AU - Chen, Fanfei
AU - Szenher, Paul
AU - Huang, Yewei
AU - Wang, Jinkun
AU - Shan, Tixiao
AU - Bai, Shi
AU - Englot, Brendan
N1 - Publisher Copyright:
© 2021 IEEE
PY - 2021
Y1 - 2021
AB - This paper studies the problem of autonomous exploration under localization uncertainty for a mobile robot equipped with 3D range sensing. We present a framework for self-learning a high-performance exploration policy in a single simulation environment and transferring it to other environments, whether physical or virtual. Recent work in transfer learning achieves encouraging performance through domain adaptation and domain randomization, which expose an agent to scenarios that fill the inherent gaps in sim2sim and sim2real approaches. However, training an agent under randomized conditions is an inefficient way for it to learn the important features of its current state; an agent can instead learn efficiently from domain knowledge provided by human experts. We propose a novel approach that combines graph neural networks with deep reinforcement learning, enabling decision-making over graphs that encode exploration information provided by human experts, in order to predict a robot's optimal sensing action in belief space. The policy, trained in only a single simulation environment, offers a real-time, scalable, and transferable decision-making strategy, achieving zero-shot transfer to other simulation environments and even to real-world environments.
UR - http://www.scopus.com/inward/record.url?scp=85124792138&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124792138&partnerID=8YFLogxK
DO - 10.1109/ICRA48506.2021.9561917
M3 - Conference contribution
AN - SCOPUS:85124792138
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 5193
EP - 5199
BT - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
T2 - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
Y2 - 30 May 2021 through 5 June 2021
ER -