Robust Unmanned Surface Vehicle Navigation with Distributional Reinforcement Learning

Xi Lin, John McConnell, Brendan Englot

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

8 Scopus citations

Abstract

Autonomous navigation of Unmanned Surface Vehicles (USVs) in marine environments with current flows is challenging, and few prior works have addressed sensor-based navigation in such environments without prior knowledge of the current flow and obstacles. We propose a Distributional Reinforcement Learning (RL) based local path planner that learns return distributions capturing the uncertainty of action outcomes, together with an adaptive algorithm that automatically tunes the level of sensitivity to risk in the environment. The proposed planner learns more stably and converges to safer policies than a traditional RL based planner. Computational experiments demonstrate that, compared to a traditional RL based planner and classical local planning methods such as Artificial Potential Fields and the Bug Algorithm, the proposed planner is robust against environmental flows and plans trajectories that are superior in safety, time, and energy consumption.
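
The idea of acting on a learned return distribution with an adjustable risk sensitivity can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the return distribution of each action is represented by a list of quantile samples, uses Conditional Value at Risk (CVaR) as the risk measure, and ties the risk level to the distance to the nearest obstacle through a simple linear rule. The abstract does not specify the risk measure or the adaptation rule, so `cvar`, `adaptive_alpha`, `select_action`, `safe_dist`, and `alpha_min` are all hypothetical names and choices.

```python
import math


def cvar(quantiles, alpha):
    """Average of the worst alpha-fraction of return samples (assumed risk measure)."""
    ordered = sorted(quantiles)
    # Keep at least one sample and never more than are available.
    k = min(len(ordered), max(1, math.ceil(alpha * len(ordered))))
    return sum(ordered[:k]) / k


def adaptive_alpha(min_obstacle_dist, safe_dist=10.0, alpha_min=0.1):
    """Hypothetical rule: risk neutral (alpha = 1) when far from obstacles,
    increasingly risk averse (alpha -> alpha_min) as the vehicle gets closer."""
    frac = min(max(min_obstacle_dist / safe_dist, 0.0), 1.0)
    return alpha_min + (1.0 - alpha_min) * frac


def select_action(quantiles_per_action, min_obstacle_dist):
    """Pick the action whose return distribution has the highest CVaR."""
    alpha = adaptive_alpha(min_obstacle_dist)
    scores = [cvar(q, alpha) for q in quantiles_per_action]
    return max(range(len(scores)), key=scores.__getitem__)


# Action 0 has a higher mean return but a rare large penalty (e.g. a collision);
# action 1 is slightly worse on average but has no bad tail.
returns = [[-10.0, 5.0, 5.0, 5.0], [1.0, 1.0, 1.0, 1.0]]
far_choice = select_action(returns, min_obstacle_dist=20.0)
near_choice = select_action(returns, min_obstacle_dist=0.0)
```

Under these assumptions, the agent prefers the higher-mean action when obstacles are far away and switches to the action without the bad tail when an obstacle is close. A planner that tracks only the expected return cannot make this distinction, because the two cases differ only in the shape of the return distribution.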

Original language: English
Title of host publication: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Pages: 6185-6191
Number of pages: 7
ISBN (Electronic): 9781665491907
State: Published - 2023
Event: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023 - Detroit, United States
Duration: 1 Oct 2023 to 5 Oct 2023

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2023
Country/Territory: United States
City: Detroit
Period: 1/10/23 to 5/10/23
