Robust Route Planning with Distributional Reinforcement Learning in a Stochastic Road Network Environment

Xi Lin, Paul Szenher, John D. Martin, Brendan Englot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Route planning is essential to mobile robot navigation problems. In recent years, deep reinforcement learning (DRL) has been applied to learning optimal planning policies in stochastic environments without prior knowledge. However, existing works focus on learning policies that maximize the expected return, the performance of which can vary greatly when the level of stochasticity in the environment is high. In this work, we propose a distributional reinforcement learning based framework that learns return distributions which explicitly reflect environmental stochasticity. Policies based on the second-order stochastic dominance (SSD) relation can be used to make adjustable route decisions according to user preference on performance robustness. Our proposed method is evaluated in a simulated road network environment, and experimental results show that our method is able to plan the shortest routes that minimize stochasticity in travel time when robustness is preferred, while other state-of-the-art DRL methods are agnostic to environmental stochasticity.
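
To make the second-order stochastic dominance (SSD) relation mentioned in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes each route's return distribution is represented by the same number of equally weighted samples (for example, quantile estimates), in which case A weakly dominates B in second order when every partial sum of A's sorted samples is at least the corresponding partial sum of B's. The function name `ssd_dominates`, the tolerance `tol`, and the two example routes are hypothetical.

```python
from itertools import accumulate


def ssd_dominates(returns_a, returns_b, tol=1e-12):
    """Check whether A weakly dominates B under second-order stochastic dominance.

    Assumption: both inputs hold the same number of equally weighted samples
    of the return (higher is better), e.g. quantile estimates.
    """
    a = sorted(returns_a)
    b = sorted(returns_b)
    if len(a) != len(b):
        raise ValueError("both distributions need the same number of samples")
    # Compare running sums of the sorted samples, worst outcomes first.
    return all(ca >= cb - tol for ca, cb in zip(accumulate(a), accumulate(b)))


# Hypothetical returns (negative travel times) for two routes with equal mean.
steady_route = [-10.0, -10.0, -10.0, -10.0]
risky_route = [-4.0, -8.0, -12.0, -16.0]

print(ssd_dominates(steady_route, risky_route))
print(ssd_dominates(risky_route, steady_route))
```

In this illustration both routes have the same expected return, so a policy that only maximizes the expectation cannot separate them, while the SSD check prefers the route with less spread in travel time.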

Original language: English
Title of host publication: 2023 20th International Conference on Ubiquitous Robots, UR 2023
Pages: 287-294
Number of pages: 8
ISBN (Electronic): 9798350335170
State: Published - 2023
Event: 20th International Conference on Ubiquitous Robots, UR 2023 - Honolulu, United States
Duration: 25 Jun 2023 to 28 Jun 2023

Publication series

Name: 2023 20th International Conference on Ubiquitous Robots, UR 2023

Conference

Conference: 20th International Conference on Ubiquitous Robots, UR 2023
Country/Territory: United States
City: Honolulu
Period: 25/06/23 to 28/06/23
