TY - JOUR
T1 - CURENet
T2 - combining unified representations for efficient chronic disease prediction
AU - Dao, Cong Tinh
AU - Phan, Nguyen Minh Thao
AU - Ding, Jun En
AU - Wu, Chenwei
AU - Restrepo, David
AU - Luo, Dongsheng
AU - Zhao, Fanyi
AU - Liao, Chun Chieh
AU - Peng, Wen Chih
AU - Wang, Chi Te
AU - Chen, Pei Fu
AU - Chen, Ling
AU - Ju, Xinglong
AU - Liu, Feng
AU - Hung, Fang Ming
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2026/12
Y1 - 2026/12
N2 - Electronic health records (EHRs) are designed to synthesize diverse data types, including unstructured clinical notes, structured lab tests, and time-series visit data. Physicians draw on these multimodal and temporal sources of EHR data to form a comprehensive view of a patient’s health, which is crucial for informed therapeutic decision-making. Yet most predictive models fail to fully capture the interactions, redundancies, and temporal patterns across data modalities, often focusing on a single data type or overlooking these complexities. In this paper, we present CURENet (Combining Unified Representations for Efficient chronic disease prediction), a multimodal model that integrates unstructured clinical notes, lab tests, and patients’ time-series data, using large language models (LLMs) to process clinical text and textual lab tests and transformer encoders to model longitudinal sequences of visits. CURENet captures the intricate interactions among different forms of clinical data and yields a more reliable predictive model for chronic illnesses. We evaluated CURENet on the public MIMIC-III and private FEMH datasets, where it achieved over 94% accuracy in predicting the top 10 chronic conditions in a multi-label framework. Our findings highlight the potential of multimodal EHR integration to enhance clinical decision-making and improve patient outcomes.
AB - Electronic health records (EHRs) are designed to synthesize diverse data types, including unstructured clinical notes, structured lab tests, and time-series visit data. Physicians draw on these multimodal and temporal sources of EHR data to form a comprehensive view of a patient’s health, which is crucial for informed therapeutic decision-making. Yet most predictive models fail to fully capture the interactions, redundancies, and temporal patterns across data modalities, often focusing on a single data type or overlooking these complexities. In this paper, we present CURENet (Combining Unified Representations for Efficient chronic disease prediction), a multimodal model that integrates unstructured clinical notes, lab tests, and patients’ time-series data, using large language models (LLMs) to process clinical text and textual lab tests and transformer encoders to model longitudinal sequences of visits. CURENet captures the intricate interactions among different forms of clinical data and yields a more reliable predictive model for chronic illnesses. We evaluated CURENet on the public MIMIC-III and private FEMH datasets, where it achieved over 94% accuracy in predicting the top 10 chronic conditions in a multi-label framework. Our findings highlight the potential of multimodal EHR integration to enhance clinical decision-making and improve patient outcomes.
KW - Electronic Health Records
KW - Large Language Model fine-tuning
KW - Multi-Disease prediction
KW - Transformer
UR - https://www.scopus.com/pages/publications/105023388044
UR - https://www.scopus.com/pages/publications/105023388044#tab=citedBy
U2 - 10.1007/s13755-025-00396-w
DO - 10.1007/s13755-025-00396-w
M3 - Article
AN - SCOPUS:105023388044
VL - 14
JO - Health Information Science and Systems
JF - Health Information Science and Systems
IS - 1
M1 - 7
ER -