TY - GEN
T1 - Bagging by design (on the suboptimality of bagging)
AU - Papakonstantinou, Periklis A.
AU - Xu, Jia
AU - Cao, Zhu
N1 - Publisher Copyright:
Copyright © 2014, Association for the Advancement of Artificial Intelligence.
PY - 2014
Y1 - 2014
N2 - Bagging (Breiman 1996) and its variants are among the most popular methods for aggregating classifiers and regressors. Originally, its analysis assumed that the bootstraps are built from an unlimited, independent source of samples; we therefore call this form of bagging ideal-bagging. However, in the real world, base predictors are trained on data subsampled from a limited number of training samples, and thus they behave very differently. We analyze the effect of intersections between bootstraps, obtained by subsampling, used to train different base predictors. Most importantly, we provide an alternative subsampling method called design-bagging, based on a new construction of combinatorial designs, and prove it universally better than bagging. Methodologically, we succeed at this level of generality because we compare the prediction accuracy of bagging and design-bagging relative to the accuracy of ideal-bagging. This finds potential applications in more involved bagging-based methods. Our analytical results are backed up by experiments in classification and regression settings.
AB - Bagging (Breiman 1996) and its variants are among the most popular methods for aggregating classifiers and regressors. Originally, its analysis assumed that the bootstraps are built from an unlimited, independent source of samples; we therefore call this form of bagging ideal-bagging. However, in the real world, base predictors are trained on data subsampled from a limited number of training samples, and thus they behave very differently. We analyze the effect of intersections between bootstraps, obtained by subsampling, used to train different base predictors. Most importantly, we provide an alternative subsampling method called design-bagging, based on a new construction of combinatorial designs, and prove it universally better than bagging. Methodologically, we succeed at this level of generality because we compare the prediction accuracy of bagging and design-bagging relative to the accuracy of ideal-bagging. This finds potential applications in more involved bagging-based methods. Our analytical results are backed up by experiments in classification and regression settings.
UR - http://www.scopus.com/inward/record.url?scp=84908213497&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84908213497&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84908213497
T3 - Proceedings of the National Conference on Artificial Intelligence
SP - 2041
EP - 2047
BT - Proceedings of the National Conference on Artificial Intelligence
T2 - 28th AAAI Conference on Artificial Intelligence, AAAI 2014, 26th Innovative Applications of Artificial Intelligence Conference, IAAI 2014 and the 5th Symposium on Educational Advances in Artificial Intelligence, EAAI 2014
Y2 - 27 July 2014 through 31 July 2014
ER -