TY - GEN
T1 - MEDFuse: Multimodal EHR Data Fusion with Masked Lab-Test Modeling and Large Language Models
T2 - 33rd ACM International Conference on Information and Knowledge Management, CIKM 2024
AU - Thao, Phan Nguyen Minh
AU - Dao, Cong Tinh
AU - Wu, Chenwei
AU - Wang, Jian Zhe
AU - Liu, Shun
AU - Ding, Jun En
AU - Restrepo, David
AU - Liu, Feng
AU - Hung, Fang Ming
AU - Peng, Wen Chih
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024/10/21
Y1 - 2024/10/21
AB - Electronic health records (EHRs) are multimodal by nature, consisting of structured tabular features such as lab tests and unstructured clinical notes. In real-life clinical practice, doctors draw on complementary multimodal EHR data sources to form a clearer picture of a patient's health and to support clinical decision-making. However, most EHR predictive models do not reflect this process, as they either focus on a single modality or overlook inter-modality interactions and redundancy. In this work, we propose MEDFuse, a Multimodal EHR Data Fusion framework that incorporates masked lab-test modeling and large language models (LLMs) to effectively integrate structured and unstructured medical data. MEDFuse leverages multimodal embeddings extracted from two sources: LLMs fine-tuned on free clinical text and masked tabular transformers trained on structured lab-test results. We design a disentangled transformer module, optimized by a mutual information loss, to (1) decouple modality-specific and modality-shared information and (2) extract useful joint representations from the noise and redundancy present in clinical notes. Through comprehensive validation on the public MIMIC-III dataset and the in-house FEMH dataset, MEDFuse demonstrates strong potential for advancing clinical prediction, achieving an F1 score above 90% on the 10-disease multi-label classification task.
KW - computer-aided diagnosis
KW - electronic health records
KW - large language model fine-tuning
UR - http://www.scopus.com/inward/record.url?scp=85210010348&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85210010348&partnerID=8YFLogxK
DO - 10.1145/3627673.3679962
M3 - Conference contribution
AN - SCOPUS:85210010348
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 3974
EP - 3978
BT - CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
Y2 - 21 October 2024 through 25 October 2024
ER -