TY - GEN
T1 - On the Power of Localized Perceptron for Label-Optimal Learning of Halfspaces with Adversarial Noise
AU - Shen, Jie
N1 - Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
AB - We study online active learning of homogeneous halfspaces in R^d with adversarial noise, where the overall probability of a noisy label is constrained to be at most ν. Our main contribution is a Perceptron-like online active learning algorithm that runs in polynomial time, and under the conditions that the marginal distribution is isotropic log-concave and ν = Ω(ε), where ε ∈ (0, 1) is the target error rate, our algorithm PAC learns the underlying halfspace with near-optimal label complexity of Õ(d · polylog(1/ε)) and sample complexity of Õ(d/ε). Prior to this work, existing online algorithms designed to tolerate adversarial noise were subject to either label complexity polynomial in 1/ε, suboptimal noise tolerance, or restrictive marginal distributions. With the additional prior knowledge that the underlying halfspace is s-sparse, we obtain attribute-efficient label complexity of Õ(s · polylog(d, 1/ε)) and sample complexity of Õ((s/ε) · polylog(d)). As an immediate corollary, we show that under the agnostic model, where no assumption is made on the noise rate ν, our active learner achieves an error rate of O(OPT) + ε with the same running time and label and sample complexity, where OPT is the best possible error rate achievable by any homogeneous halfspace.
UR - http://www.scopus.com/inward/record.url?scp=85161285406&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161285406&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85161285406
T3 - Proceedings of Machine Learning Research
SP - 9503
EP - 9514
BT - Proceedings of the 38th International Conference on Machine Learning, ICML 2021
T2 - 38th International Conference on Machine Learning, ICML 2021
Y2 - 18 July 2021 through 24 July 2021
ER -