Can Large Language Models Reason About Goal-Oriented Tasks?

Filippos Bellos, Yayuan Li, Wuao Liu, Jason J. Corso

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Most adults can complete a sequence of steps to achieve a certain goal, such as making a sandwich or repairing a bicycle tire. Completing these goal-oriented tasks, or simply tasks in this paper, requires sequential reasoning to understand the relationship between the sequence of steps and the goal. Large language models (LLMs) have shown impressive capabilities across a variety of natural language understanding tasks. However, prior work has focused mainly on logical reasoning tasks (e.g., arithmetic, commonsense QA); how well LLMs perform on more complex reasoning tasks such as sequential reasoning remains unclear. In this paper, we address this gap and conduct a comprehensive evaluation of how well LLMs carry out this reasoning for tasks and how their performance scales along multiple dimensions (e.g., adaptive prompting strategies, number of in-context examples, and the complexity of the sequential task). Our findings reveal that while Chain of Thought (CoT) prompting can significantly enhance LLMs' sequential reasoning in certain scenarios, it can also be detrimental in others, whereas Tree of Thoughts (ToT) reasoning is less effective for this type of task. Additionally, we find that increasing model size or the number of in-context examples does not consistently improve performance.
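The abstract contrasts standard prompting with Chain of Thought (CoT) prompting on sequential, goal-oriented tasks. As an illustration only (the paper's exact prompts and evaluation harness are not reproduced here), the Python sketch below shows how a step-ordering query for such a task might be posed with and without a CoT trigger; `query_llm` is a hypothetical placeholder, not the authors' actual interface.

```python
# Illustrative sketch only: prompt construction for a goal-oriented
# step-ordering task, with and without a Chain of Thought (CoT) trigger.
# `query_llm` is a hypothetical stand-in for a real LLM API call.

from typing import List


def build_prompt(goal: str, shuffled_steps: List[str], use_cot: bool) -> str:
    """Ask the model to put shuffled task steps into the correct order."""
    steps = "\n".join(f"- {s}" for s in shuffled_steps)
    prompt = (
        f"Goal: {goal}\n"
        f"The following steps are listed in random order:\n{steps}\n"
        "Rewrite the steps in the correct order to achieve the goal."
    )
    if use_cot:
        # CoT trigger: ask the model to reason before giving the ordering.
        prompt += "\nLet's think step by step about which step must come first."
    return prompt


def query_llm(prompt: str) -> str:
    # Hypothetical placeholder: replace with a call to the model under evaluation.
    raise NotImplementedError


if __name__ == "__main__":
    goal = "make a sandwich"
    steps = ["spread the condiments", "get two slices of bread", "add the fillings"]
    print(build_prompt(goal, steps, use_cot=True))
```

A harness along these lines could then compare the model's predicted ordering against the reference step sequence across tasks of varying length and complexity.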

Original language: English
Title of host publication: SCALE-LLM 2024 - 1st Edition of the Workshop on the Scaling Behavior of Large Language Models, Proceedings of the Workshop
Editors: Antonio Valerio Miceli-Barone, Fazl Barez, Shay B. Cohen, Elena Voita, Ulrich Germann, Michal Lukasik
Pages: 24-34
Number of pages: 11
ISBN (Electronic): 9798891760776
State: Published - 2024
Event: 1st Workshop on the Scaling Behavior of Large Language Models, SCALE-LLM 2024 - St. Julian's, Malta
Duration: 22 Mar 2024 → …

Publication series

Name: SCALE-LLM 2024 - 1st Edition of the Workshop on the Scaling Behavior of Large Language Models, Proceedings of the Workshop

Conference

Conference: 1st Workshop on the Scaling Behavior of Large Language Models, SCALE-LLM 2024
Country/Territory: Malta
City: St. Julian's
Period: 22/03/24 → …
