Imputation of missing values in time series with lagged correlations

Shah Atiqur Rahman, Yuxiao Huang, Jan Claassen, Samantha Kleinberg

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

8 Scopus citations

Abstract

Missing values are a common problem in real-world data and are particularly prevalent in biomedical time series, where a patient's medical record may be split across multiple institutions or a device may briefly fail. These data are not missing completely at random, so ignoring the missing values can lead to bias and error during data mining. However, current methods for imputing missing values have yet to account for the fact that variables are correlated and that those relationships exist across time. To address this, we propose an imputation method (FLk-NN) that incorporates time-lagged correlations both within and across variables by combining two imputation methods, one based on an extension to k-NN and the other on the Fourier transform. This enables imputation of missing values even when all data at a time point are missing and when there are different types of missingness both within and across variables. In comparison to other approaches on two biological datasets (simulated glucose in Type 1 diabetes and multi-modality neurological ICU monitoring), the proposed method has the highest imputation accuracy. This held with up to half of the data missing and when consecutive missing values made up a significant fraction of the overall time series length.
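
To make the idea of combining a lagged k-NN estimate with a Fourier estimate concrete, here is a minimal sketch in Python with NumPy. It is not the FLk-NN algorithm from the paper: the function names, the two-variable setup, the choice of lag by maximum absolute correlation, the low-pass iterative Fourier fill, and the equal-weight average are all assumptions made for illustration.

```python
import numpy as np


def best_lag(target, other, max_lag):
    """Lag in 0..max_lag of `other` with the largest |correlation| to `target`."""
    best, best_r = 0, -1.0
    for lag in range(max_lag + 1):
        a = target[lag:]
        b = other[:len(other) - lag]
        ok = ~np.isnan(a) & ~np.isnan(b)
        if ok.sum() < 3:
            continue
        r = abs(np.corrcoef(a[ok], b[ok])[0, 1])
        if r > best_r:
            best, best_r = lag, r
    return best


def lagged_knn_impute(target, other, k=3, max_lag=5):
    """Fill NaNs in `target` from the k time points whose lagged `other`
    value is closest. Points with no lagged predictor stay NaN."""
    target = np.asarray(target, dtype=float)
    other = np.asarray(other, dtype=float)
    lag = best_lag(target, other, max_lag)
    shifted = np.full_like(other, np.nan)
    shifted[lag:] = other[:len(other) - lag]
    out = target.copy()
    donors = np.flatnonzero(~np.isnan(target) & ~np.isnan(shifted))
    for t in np.flatnonzero(np.isnan(target) & ~np.isnan(shifted)):
        order = np.argsort(np.abs(shifted[donors] - shifted[t]))[:k]
        out[t] = target[donors[order]].mean()
    return out


def fourier_impute(x, n_keep=3, n_iter=50):
    """Fill NaNs by repeatedly reconstructing the series from its n_keep
    largest Fourier coefficients (uses only the series itself)."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    filled = np.where(missing, np.nanmean(x), x)
    for _ in range(n_iter):
        spec = np.fft.rfft(filled)
        spec[np.argsort(np.abs(spec))[:-n_keep]] = 0  # zero the small terms
        recon = np.fft.irfft(spec, n=len(filled))
        filled[missing] = recon[missing]
    return filled


def combined_impute(target, other, k=3, max_lag=5, n_keep=3):
    """Average the two estimates (equal weights are an assumption); fall
    back to the Fourier estimate where the k-NN estimate is unavailable."""
    knn = lagged_knn_impute(target, other, k=k, max_lag=max_lag)
    fou = fourier_impute(target, n_keep=n_keep)
    est = np.where(np.isnan(knn), fou, 0.5 * (knn + fou))
    return np.where(np.isnan(target), est, target)


if __name__ == "__main__":
    t = np.arange(200)
    other = np.sin(2 * np.pi * t / 25)
    target = np.sin(2 * np.pi * (t - 3) / 25)  # follows `other` by 3 steps
    target[50:60] = np.nan                     # a run of consecutive gaps
    print(combined_impute(target, other)[50:60])
```

The fallback in `combined_impute` reflects the point made in the abstract: the Fourier estimate depends only on the target series, so it can still produce a value at time points where no lagged predictor is available.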

Original language: English
Title of host publication: Proceedings - 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
Editors: Zhi-Hua Zhou, Wei Wang, Ravi Kumar, Hannu Toivonen, Jian Pei, Joshua Zhexue Huang, Xindong Wu
Pages: 753-762
Number of pages: 10
Edition: January
ISBN (Electronic): 9781479942749
State: Published - 26 Jan 2015
Event: 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014 - Shenzhen, China
Duration: 14 Dec 2014 → …

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
Number: January
Volume: 2015-January
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Conference

Conference: 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
Country/Territory: China
City: Shenzhen
Period: 14/12/14 → …

Keywords

  • Fourier imputation
  • correlated data with time-lag
  • extended k-NN imputation
  • missing data
