TY - GEN
T1 - Imbalanced learning for pattern recognition
T2 - Unmanned/Unattended Sensors and Sensor Networks VII
AU - He, Haibo
AU - Chen, Sheng
AU - Man, Hong
AU - Desai, Sachi
AU - Quoraishee, Shafik
PY - 2010
Y1 - 2010
N2 - The imbalanced learning problem (learning from imbalanced data) presents a significant new challenge to the pattern recognition and machine learning communities because most real-world data is imbalanced. In military applications, the imbalanced learning problem becomes much more critical because such skewed distributions normally carry the most interesting and critical information. This information is necessary to support the decision-making process in battlefield scenarios, such as anomaly or intrusion detection. The fundamental issue with imbalanced learning is that imbalanced data can compromise the performance of standard learning algorithms, which assume balanced class distributions or equal misclassification costs. Therefore, when presented with complex imbalanced data sets, these algorithms may not be able to properly represent the distributive characteristics of the data. In this paper, we present an empirical study of several popular imbalanced learning algorithms on an army-relevant data set. Specifically, we conduct various experiments with the SMOTE (Synthetic Minority Over-Sampling Technique), ADASYN (Adaptive Synthetic Sampling), SMOTEBoost (Synthetic Minority Over-Sampling in Boosting), and AdaCost (Misclassification Cost-Sensitive Boosting method) schemes. Detailed experimental settings and simulation results are presented, along with a brief discussion of future research opportunities and challenges.
AB - The imbalanced learning problem (learning from imbalanced data) presents a significant new challenge to the pattern recognition and machine learning communities because most real-world data is imbalanced. In military applications, the imbalanced learning problem becomes much more critical because such skewed distributions normally carry the most interesting and critical information. This information is necessary to support the decision-making process in battlefield scenarios, such as anomaly or intrusion detection. The fundamental issue with imbalanced learning is that imbalanced data can compromise the performance of standard learning algorithms, which assume balanced class distributions or equal misclassification costs. Therefore, when presented with complex imbalanced data sets, these algorithms may not be able to properly represent the distributive characteristics of the data. In this paper, we present an empirical study of several popular imbalanced learning algorithms on an army-relevant data set. Specifically, we conduct various experiments with the SMOTE (Synthetic Minority Over-Sampling Technique), ADASYN (Adaptive Synthetic Sampling), SMOTEBoost (Synthetic Minority Over-Sampling in Boosting), and AdaCost (Misclassification Cost-Sensitive Boosting method) schemes. Detailed experimental settings and simulation results are presented, along with a brief discussion of future research opportunities and challenges.
KW - Adaptive synthetic sampling
KW - Imbalanced learning
KW - Machine learning
KW - Pattern recognition
UR - http://www.scopus.com/inward/record.url?scp=78649863824&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78649863824&partnerID=8YFLogxK
U2 - 10.1117/12.867737
DO - 10.1117/12.867737
M3 - Conference contribution
AN - SCOPUS:78649863824
SN - 9780819483515
T3 - Proceedings of SPIE - The International Society for Optical Engineering
BT - Unmanned/Unattended Sensors and Sensor Networks VII
Y2 - 20 September 2010 through 22 September 2010
ER -