Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation

John Martin, Jinkun Wang, Brendan Englot

Research output: Contribution to journal › Conference article › peer-review

5 Scopus citations

Abstract

We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
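
The reduction of TD updates to Gaussian Process regression described in the abstract can be illustrated with a standard GPTD-style model. This is a minimal sketch of the general idea under simplifying assumptions, not the paper's SPGP-SARSA derivation. The symbols below are introduced here for illustration only: a zero-mean GP prior over the value function V with kernel k, a discount factor γ, visited states x_1, ..., x_t, observed rewards r_1, ..., r_{t-1}, and i.i.d. Gaussian noise n_i with variance σ².

```latex
\begin{align}
r_i &= V(x_i) - \gamma V(x_{i+1}) + n_i, \qquad i = 1, \dots, t-1, \\
\mathbf{r} &= \mathbf{H}\mathbf{v} + \mathbf{n}, \qquad
\mathbf{H} =
\begin{bmatrix}
1 & -\gamma &        &         \\
  & \ddots  & \ddots &         \\
  &         & 1      & -\gamma
\end{bmatrix}
\in \mathbb{R}^{(t-1) \times t}, \\
\mathbb{E}\left[ V(x) \mid \mathbf{r} \right]
  &= \mathbf{k}(x)^\top \mathbf{H}^\top
     \left( \mathbf{H}\mathbf{K}\mathbf{H}^\top + \sigma^2 \mathbf{I} \right)^{-1} \mathbf{r}, \\
\operatorname{Var}\left[ V(x) \mid \mathbf{r} \right]
  &= k(x, x) - \mathbf{k}(x)^\top \mathbf{H}^\top
     \left( \mathbf{H}\mathbf{K}\mathbf{H}^\top + \sigma^2 \mathbf{I} \right)^{-1}
     \mathbf{H}\mathbf{k}(x).
\end{align}
```

Here v = (V(x_1), ..., V(x_t)), K is the t × t kernel matrix with entries k(x_i, x_j), and k(x) = (k(x_1, x), ..., k(x_t, x)). Under these assumptions the rewards act as noisy linear observations of the latent values, so the posterior follows from ordinary GP regression. The matrix being inverted grows with the number of transitions, which is the cost that a sparse approximation is meant to reduce.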

Original language: English
Pages (from-to): 179-189
Number of pages: 11
Journal: Proceedings of Machine Learning Research
Volume: 87
State: Published - 2018
Event: 2nd Conference on Robot Learning, CoRL 2018 - Zurich, Switzerland
Duration: 29 Oct 2018 → 31 Oct 2018

Keywords

  • Reinforcement Learning
  • Sparse Gaussian Process Regression
