TY - GEN
T1 - Robust versions of the Tukey boxplot with their application to detection of outliers
AU - Shevlyakov, Georgy
AU - Andrea, Kliton
AU - Choudur, Lakshminarayan
AU - Smirnov, Pavel
AU - Ulanov, Alexander
AU - Vassilieva, Natalia
PY - 2013/10/18
Y1 - 2013/10/18
AB - The need for fast on-line algorithms to analyze high data-rate measurements is a vital element in production settings. Given the ever-increasing number of data sources coupled with increasing complexity of applications, and workload patterns, anomaly detection methods should be light-weight and must operate in real-time. In many modern applications, data arrive in a streaming fashion. Therefore, the underlying assumption of classical methods that the data is a sample from a stable distribution is not valid, and Gaussian and non-parametric based methods such as the control chart and boxplot are inadequate. Streaming data is an ever-changing superposition of distributions. Detection of such changes in real-time is one of the fundamental challenges. We propose low-complexity robust modifications to the conventional Tukey boxplot based on fast highly efficient robust estimates of scale. Results using synthetic as well as real-world data show that our methods outperform the Tukey boxplot and methods based on Gaussian limits.
KW - boxplot
KW - outlier
KW - robustness
UR - http://www.scopus.com/inward/record.url?scp=84890509385&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890509385&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2013.6638919
DO - 10.1109/ICASSP.2013.6638919
M3 - Conference contribution
AN - SCOPUS:84890509385
SN - 9781479903566
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 6506
EP - 6510
BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Y2 - 26 May 2013 through 31 May 2013
ER -