On the Expressivity of Markov Reward (Extended Abstract)

David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Reward is the driving force for reinforcement-learning agents. Here we set out to understand the expressivity of Markov reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of “task”: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories. Our main results prove that while reward can express many of these tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a Markov reward function under which an agent can perform each task type, and that correctly determine when no such reward function exists.
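
One way to make this construction concrete (a sketch under stated assumptions, not necessarily the authors' exact algorithm) is to note that, for a fixed start-state distribution and discount factor, a policy's value is linear in the reward function through its discounted state-action occupancy measure. Realizing task type (1), a set of acceptable behaviors, then reduces to a linear feasibility problem: find a reward under which every acceptable policy strictly outvalues every unacceptable one, or report that none exists. The toy MDP, discount factor, separation margin epsilon, and helper names occupancy and find_reward below are all illustrative assumptions.

```python
# A minimal sketch, assuming a small finite MDP: frame "a set of acceptable
# behaviors" as a linear feasibility problem over Markov rewards R(s, a).
# A policy's discounted value is linear in R via its occupancy measure, so
# an LP can find a separating reward or certify that none exists.
import itertools
import numpy as np
from scipy.optimize import linprog

GAMMA = 0.9                    # discount factor (assumed)
N_STATES, N_ACTIONS = 2, 2     # toy two-state, two-action MDP (assumed)
# P[s, a] is the next-state distribution after taking action a in state s.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
START = np.array([1.0, 0.0])   # start-state distribution

def occupancy(policy):
    """Discounted state-action occupancy of a deterministic policy (tuple s -> a)."""
    P_pi = np.array([P[s, policy[s]] for s in range(N_STATES)])
    # Solve d = START + GAMMA * P_pi^T d for the state occupancy vector.
    d_s = np.linalg.solve(np.eye(N_STATES) - GAMMA * P_pi.T, START)
    d = np.zeros((N_STATES, N_ACTIONS))
    for s in range(N_STATES):
        d[s, policy[s]] = d_s[s]
    return d.ravel()

def find_reward(acceptable, epsilon=1e-3):
    """Return an R(s, a) realizing the acceptable set, or None if none exists."""
    policies = list(itertools.product(range(N_ACTIONS), repeat=N_STATES))
    good = [occupancy(p) for p in policies if p in acceptable]
    bad = [occupancy(p) for p in policies if p not in acceptable]
    # Require d_good . R >= d_bad . R + epsilon, i.e. (d_bad - d_good) . R <= -epsilon.
    A_ub = np.array([b - g for g in good for b in bad])
    b_ub = np.full(len(A_ub), -epsilon)
    res = linprog(c=np.zeros(N_STATES * N_ACTIONS), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(-1.0, 1.0)] * (N_STATES * N_ACTIONS), method="highs")
    return res.x.reshape(N_STATES, N_ACTIONS) if res.success else None

# Acceptable task: any policy that takes action 0 in state 0.
R = find_reward(acceptable={(0, 0), (0, 1)})
print("no Markov reward exists" if R is None else f"realizing reward:\n{R}")
```

A useful sanity check: in this toy MDP, the acceptable set containing only policy (0, 0) is unrealizable, because policies (0, 0) and (0, 1) never reach state 1 from the start state and so induce identical occupancies; no Markov reward can strictly separate them, and the LP correctly reports infeasibility, matching the paper's point that some instances of each task type are inexpressible.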

Original language: English
Title of host publication: Proceedings of the 31st International Joint Conference on Artificial Intelligence, IJCAI 2022
Editors: Luc De Raedt
Pages: 5254-5258
Number of pages: 5
ISBN (Electronic): 9781956792003
State: Published - 2022
Event: 31st International Joint Conference on Artificial Intelligence, IJCAI 2022 - Vienna, Austria
Duration: 23 Jul 2022 - 29 Jul 2022

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823

Conference

Conference: 31st International Joint Conference on Artificial Intelligence, IJCAI 2022
Country/Territory: Austria
City: Vienna
Period: 23/07/22 - 29/07/22
