TY - JOUR
T1 - Using Domain Knowledge to Overcome Latent Variables in Causal Inference from Time Series
AU - Zheng, Min
AU - Kleinberg, Samantha
N1 - Publisher Copyright: © 2019 M. Zheng & S. Kleinberg.
PY - 2019
Y1 - 2019
N2 - Increasingly large observational datasets from healthcare and social media may allow new types of causal inference. However, these data are often missing key variables, increasing the chance of finding spurious causal relationships due to confounding. While methods exist for causal inference with latent variables in static cases, temporal relationships are more challenging, as varying time lags make latent causes more difficult to uncover and approaches often have significantly higher computational complexity. To address this, we make the key observation that while a variable may be latent in one dataset, it may be observed in another, or we may have domain knowledge about its effects. We propose a computationally efficient method that overcomes latent variables by using prior knowledge to reconstruct data for unobserved variables, while remaining robust to cases when the knowledge is wrong or does not apply. On simulated data, our approach outperforms the state of the art with a lower false discovery rate for causal inference. On real-world data from individuals with Type 1 diabetes, we show that our approach can discover causal relationships involving unmeasured meals and exercise.
UR - http://www.scopus.com/inward/record.url?scp=85118737197&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118737197&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85118737197
VL - 106
SP - 474
EP - 489
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 4th Machine Learning for Healthcare Conference, MLHC 2019
Y2 - 9 August 2019 through 10 August 2019
ER -