TY - GEN
T1 - Control in Stochastic Environment with Delays
T2 - 34th International Conference on Automated Planning and Scheduling, ICAPS 2024
AU - Yao, Zhiyuan
AU - Florescu, Ionut
AU - Lee, Chihoon
N1 - Publisher Copyright: Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2024/5/30
Y1 - 2024/5/30
N2 - This paper introduces a new reinforcement learning method for control problems in environments with delayed feedback. Specifically, our method employs stochastic planning, whereas previous methods used deterministic planning. This allows us to embed risk preference in the policy optimization problem. We show that this formulation can recover the optimal policy for problems with deterministic transitions. We contrast our policy with two prior methods from the literature. We apply the methodology to simple tasks to understand its features. Then, we compare the performance of the methods in controlling multiple Atari games.
AB - This paper introduces a new reinforcement learning method for control problems in environments with delayed feedback. Specifically, our method employs stochastic planning, whereas previous methods used deterministic planning. This allows us to embed risk preference in the policy optimization problem. We show that this formulation can recover the optimal policy for problems with deterministic transitions. We contrast our policy with two prior methods from the literature. We apply the methodology to simple tasks to understand its features. Then, we compare the performance of the methods in controlling multiple Atari games.
UR - http://www.scopus.com/inward/record.url?scp=85195902356&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85195902356&partnerID=8YFLogxK
U2 - 10.1609/icaps.v34i1.31529
DO - 10.1609/icaps.v34i1.31529
M3 - Conference contribution
AN - SCOPUS:85195902356
T3 - Proceedings International Conference on Automated Planning and Scheduling, ICAPS
SP - 663
EP - 670
BT - Proceedings of the 34th International Conference on Automated Planning and Scheduling, ICAPS 2024
A2 - Bernardini, Sara
A2 - Muise, Christian
Y2 - 1 June 2024 through 6 June 2024
ER -