
Exploring User Behavior and Validation Proficiency in Assessing Responses From a Conversational Agent

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

With the rapid development of large language models (LLMs) such as ChatGPT, conversational agents are becoming popular alternatives to traditional search engines. However, users’ ability to distinguish accurate from inaccurate replies generated by conversational agents, and how users behave when validating those replies, remains unclear. This study examines users’ behavior and their ability to detect incorrect responses from ChatGPT, both with and without Google search results available for validation, through a user study with 15 participants. Participants assessed ChatGPT’s answers to questions about Alzheimer’s Disease, of which 93.33% (28/30) were accurate and 6.67% (2/30) were erroneous. Interestingly, when Google search results were available, participants tended to view both correct and incorrect responses favorably. These findings provide insight into the strategies users employ to validate conversational agents’ responses, highlighting differences in behavior with and without the assistance of search engines.

Original language: English
Title of host publication: Applied Human Factors and Ergonomics International
Publisher: AHFE International
Pages: 139-148
Number of pages: 10
DOIs
State: Published - 2025

Publication series

Name: Applied Human Factors and Ergonomics International
Volume: 197
ISSN (Electronic): 2771-0718

Keywords

  • Artificial intelligence
  • Computing methodologies
  • Empirical studies in HCI
  • Human-centered computing
  • Human-computer interaction (HCI)

