Lessons learned in replicating data-driven experiments in multiple medical systems and patient populations.

Samantha Kleinberg, Noémie Elhadad

Research output: Contribution to journal, Article, peer-review

7 Scopus citations

Abstract

Electronic health records are an increasingly important source of data for research, allowing for large-scale longitudinal studies on the same population that is being treated. Unlike data from controlled studies, however, these records vary widely in quality, quantity, and structure. To know whether algorithms can accurately uncover new knowledge from these records, and whether their findings can be extrapolated to new populations, both the algorithms and the findings must be validated. One approach is to conduct the same study at multiple sites and compare results, but it is a challenge to determine whether differences are due to artifacts of the medical process, population differences, or failures of the methods used. In this paper we describe the results of replicating a data-driven experiment to infer possible causes of congestive heart failure and their timing, using data from two medical systems and two patient populations. We focus on the difficulties faced in this type of work, lessons learned, and recommendations for future research.

Original language: English
Pages (from-to): 786-795
Number of pages: 10
Journal: AMIA ... Annual Symposium proceedings. AMIA Symposium
Volume: 2013
State: Published - 2013
