TY - JOUR
T1 - Attribute-Efficient PAC Learning of Low-Degree Polynomial Threshold Functions with Nasty Noise
AU - Zeng, Shiwei
AU - Shen, Jie
N1 - Publisher Copyright:
© 2023 Proceedings of Machine Learning Research. All rights reserved.
PY - 2023
Y1 - 2023
N2 - The concept class of low-degree polynomial threshold functions (PTFs) plays a fundamental role in machine learning. In this paper, we study PAC learning of K-sparse degree-d PTFs on R^n, where any such concept depends only on K out of n attributes of the input. Our main contribution is a new algorithm that runs in time (nd/ε)^{O(d)} and, under the Gaussian marginal distribution, PAC learns the class up to error rate ε with O(K^{4d}/ε^{2d} · log^{5d} n) samples, even when an η ≤ O(ε^d) fraction of them are corrupted by the nasty noise of Bshouty et al. (2002), possibly the strongest corruption model. Prior to this work, attribute-efficient robust algorithms were established only for the special case of sparse homogeneous halfspaces. Our key ingredients are: 1) a structural result that translates the attribute sparsity to a sparsity pattern of the Chow vector under the basis of Hermite polynomials, and 2) a novel attribute-efficient robust Chow vector estimation algorithm which uses exclusively a restricted Frobenius norm to either certify a good approximation or to validate a sparsity-induced degree-2d polynomial as a filter to detect corrupted samples.
UR - http://www.scopus.com/inward/record.url?scp=85174422840&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174422840&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85174422840
VL - 202
SP - 40719
EP - 40748
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 40th International Conference on Machine Learning, ICML 2023
Y2 - 23 July 2023 through 29 July 2023
ER -