Control in Stochastic Environment with Delays: A Model-based Reinforcement Learning Approach

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We introduce a new reinforcement learning method for control problems in environments with delayed feedback. Unlike previous methods, which rely on deterministic planning, our method employs stochastic planning. This allows us to embed risk preference in the policy optimization problem. We show that this formulation can recover the optimal policy for problems with deterministic transitions. We contrast our policy with two prior methods from the literature. We apply the methodology to simple tasks to understand its features, and then compare the performance of the methods in controlling multiple Atari games.

Original language: English
Title of host publication: Proceedings of the 34th International Conference on Automated Planning and Scheduling, ICAPS 2024
Editors: Sara Bernardini, Christian Muise
Pages: 663-670
Number of pages: 8
ISBN (Electronic): 9781577358893
State: Published - 30 May 2024
Event: 34th International Conference on Automated Planning and Scheduling, ICAPS 2024 - Banff, Canada
Duration: 1 Jun 2024 → 6 Jun 2024

Publication series

Name: Proceedings International Conference on Automated Planning and Scheduling, ICAPS
Volume: 34
ISSN (Print): 2334-0835
ISSN (Electronic): 2334-0843

Conference

Conference: 34th International Conference on Automated Planning and Scheduling, ICAPS 2024
Country/Territory: Canada
City: Banff
Period: 1/06/24 → 6/06/24
