TY - GEN
T1 - Learning skills to patch plans based on inaccurate models
AU - Lagrassa, Alex
AU - Lee, Steven
AU - Kroemer, Oliver
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/10/24
Y1 - 2020/10/24
N2 - Planners using accurate models can be effective for accomplishing manipulation tasks in the real world, but are typically highly specialized and require significant fine-tuning to be reliable. Meanwhile, learning is useful for adaptation, but can require a substantial amount of data collection. In this paper, we propose a method that improves the efficiency of sub-optimal planners with approximate but simple and fast models by switching to a model-free policy when unexpected transitions are observed. Unlike previous work, our method specifically addresses when the planner fails due to transition model error by patching with a local policy only where needed. First, we use a sub-optimal model-based planner to perform a task until model failure is detected. Next, we learn a local model-free policy from expert demonstrations to complete the task in regions where the model failed. To show the efficacy of our method, we perform experiments with a shape insertion puzzle and compare our results to both pure planning and imitation learning approaches. We then apply our method to a door opening task. Our experiments demonstrate that our patch-enhanced planner performs more reliably than pure planning and with lower overall sample complexity than pure imitation learning.
UR - https://www.scopus.com/pages/publications/85098417474
U2 - 10.1109/IROS45743.2020.9341475
DO - 10.1109/IROS45743.2020.9341475
M3 - Conference contribution
AN - SCOPUS:85098417474
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 9441
EP - 9448
BT - 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2020
T2 - 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2020
Y2 - 24 October 2020 through 24 January 2021
ER -