Abstract
Efficiently and accurately retrieving specific information from healthcare datasets, such as the Vaccine Adverse Event Reporting System (VAERS), presents significant challenges. A promising solution is the Text-to-ESQ approach, which is akin to Text-to-SQL but targets the NoSQL database Elasticsearch to explore VAERS data thoroughly. Non-relational databases are particularly adept at managing complex and dynamic data formats, which enables the extraction of more valuable insights. However, generating executable NoSQL queries remains challenging because NoSQL query datasets are scarce, which constrains model training. One potential remedy is to use large language models (LLMs), which can be applied in few-shot and even zero-shot settings. Nonetheless, the lack of prior evaluation for this novel task, coupled with the absence of a comprehensive, unbiased assessment of existing LLMs and prompting strategies, impedes the development of a robust architecture. Motivated by these challenges, we introduce a new Instruction-Enhanced Explainable (InstructEx) Chain-of-Thought (CoT) prompting method that integrates existing CoT prompts, and we conduct a comprehensive investigation of LLMs and CoT prompting. Extensive experimental analysis demonstrates the effectiveness of using LLMs for Text-to-ESQ when combined with InstructExCoT prompting, and sheds light on the strengths and weaknesses of these methods from multiple perspectives.
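To make the Text-to-ESQ idea concrete, the following is a minimal sketch of how a natural-language question might be paired with an Elasticsearch Query DSL body in a one-shot prompt, and how a model's reply might be checked before execution. It is not the InstructEx CoT prompt from the paper. The field names (`SYMPTOM_TEXT`, `VAX_TYPE`), the example question, and the helper names `build_prompt` and `parse_query` are all assumptions chosen for illustration.

```python
import json

# Hypothetical one-shot example. The field names are assumptions for
# illustration and are not taken from the paper's index mapping.
EXAMPLE_QUESTION = "Which reports mention fever after a flu vaccine?"
EXAMPLE_QUERY = {
    "query": {
        "bool": {
            "must": [
                {"match": {"SYMPTOM_TEXT": "fever"}},
                {"term": {"VAX_TYPE": "FLU"}},
            ]
        }
    }
}


def build_prompt(question: str) -> str:
    """Assemble a one-shot prompt asking for an Elasticsearch query body."""
    return "\n".join([
        "Translate the question into an Elasticsearch Query DSL body.",
        "Return only the JSON body on a single line.",
        f"Question: {EXAMPLE_QUESTION}",
        f"Query: {json.dumps(EXAMPLE_QUERY)}",
        f"Question: {question}",
        "Query:",
    ])


def parse_query(model_output: str) -> dict:
    """Parse the model reply and check it is a JSON object with a 'query' key."""
    body = json.loads(model_output.strip())
    if not isinstance(body, dict) or "query" not in body:
        raise ValueError("expected a JSON object with a top-level 'query' key")
    return body
```

A structural check like `parse_query` only confirms that the reply is well-formed JSON with a `query` key; whether the query actually executes and returns the intended records still has to be verified against the index.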
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024 |
| Editors | Mario Cannataro, Huiru Zheng, Lin Gao, Jianlin Cheng, Joao Luis de Miranda, Ester Zumpano, Xiaohua Hu, Young-Rae Cho, Taesung Park |
| Pages | 5174-5181 |
| Number of pages | 8 |
| ISBN (Electronic) | 9798350386226 |
| State | Published - 2024 |
| Event | 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024 - Lisbon, Portugal. Duration: 3 Dec 2024 → 6 Dec 2024 |
Publication series
| Name | Proceedings - 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024 |
|---|---|
Conference
| Conference | 2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024 |
|---|---|
| Country/Territory | Portugal |
| City | Lisbon |
| Period | 3/12/24 → 6/12/24 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Elasticsearch query
- Natural language querying
- NoSQL
- Text-to-ESQ
- VAERS
Fingerprint
Dive into the research topics of 'Natural Language Querying on Domain-Specific NoSQL Database with Large Language Models'. Together they form a unique fingerprint.