TY - GEN
T1 - Performance analysis of hierarchical reinforcement learning framework for stochastic space logistics
AU - Takubo, Yuji
AU - Chen, Hao
AU - Ho, Koki
N1 - Publisher Copyright:
© 2020 The MITRE Corporation. All Rights Reserved.
PY - 2020
Y1 - 2020
N2 - This paper analyzes a hierarchical reinforcement learning architecture for long-term space campaign design that accounts for stochastic mission parameters and the future influence of space infrastructure deployed in earlier missions for resource utilization. The hierarchical framework comprises three levels of decision making: vehicle design via value function approximation (the vehicle design agent), infrastructure deployment mission planning via a reinforcement learning algorithm (the infrastructure deployment agent), and space transportation scheduling via mixed-integer linear programming. These three levels are applied iteratively to find the mission design and vehicle/infrastructure sizing that minimize the total campaign cost. Additionally, an asynchronous pre-training phase is introduced before the dual-agent learning phase, in which each agent independently pre-learns a sub-optimal policy so that both agents can run efficiently during dual-agent learning. The framework overcomes the difficulty of finding a robust design solution for space campaigns under uncertainty and is flexible enough to incorporate various reinforcement learning algorithms. As a case study, the framework is applied to a set of lunar space campaign scenarios with potential resource utilization capabilities, and representative state-of-the-art reinforcement learning algorithms are integrated into the framework for comparison. The results show that deterministic actor-critic reinforcement learning algorithms outperform the other tested algorithms for the considered space campaign design.
AB - This paper analyzes a hierarchical reinforcement learning architecture for long-term space campaign design that accounts for stochastic mission parameters and the future influence of space infrastructure deployed in earlier missions for resource utilization. The hierarchical framework comprises three levels of decision making: vehicle design via value function approximation (the vehicle design agent), infrastructure deployment mission planning via a reinforcement learning algorithm (the infrastructure deployment agent), and space transportation scheduling via mixed-integer linear programming. These three levels are applied iteratively to find the mission design and vehicle/infrastructure sizing that minimize the total campaign cost. Additionally, an asynchronous pre-training phase is introduced before the dual-agent learning phase, in which each agent independently pre-learns a sub-optimal policy so that both agents can run efficiently during dual-agent learning. The framework overcomes the difficulty of finding a robust design solution for space campaigns under uncertainty and is flexible enough to incorporate various reinforcement learning algorithms. As a case study, the framework is applied to a set of lunar space campaign scenarios with potential resource utilization capabilities, and representative state-of-the-art reinforcement learning algorithms are integrated into the framework for comparison. The results show that deterministic actor-critic reinforcement learning algorithms outperform the other tested algorithms for the considered space campaign design.
UR - http://www.scopus.com/inward/record.url?scp=85097681399&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097681399&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85097681399
SN - 9781624106088
T3 - Accelerating Space Commerce, Exploration, and New Discovery Conference, ASCEND 2020
BT - Accelerating Space Commerce, Exploration, and New Discovery Conference, ASCEND 2020
T2 - Accelerating Space Commerce, Exploration, and New Discovery Conference, ASCEND 2020
Y2 - 16 November 2020 through 19 November 2020
ER -