TY - JOUR
T1 - Counter-adversarial training data distribution validation
AU - Indyk, Ihor
AU - Zabarankin, Michael
N1 - Publisher Copyright:
© The Author(s) 2025.
PY - 2025
Y1 - 2025
N2 - Many machine learning algorithms lack a procedure for testing/validating the integrity of training data, so that attacks on training data remain a simple yet effective way to derail those algorithms—they can result in devastating consequences, particularly in security-sensitive systems. The difficulty in designing an effective testing procedure is that the distribution of the training data may change naturally with time. Thus, the question is how to distinguish between a natural change in the distribution and a change due to an adversary's attack. This work offers a partial answer with a modification of the existing conditional generative adversarial networks (CGAN) framework, in which a generative model aims not only to avoid detection by a discriminator but also to design a poisoning sample that will result in the largest prediction error if a classifier is trained on data that includes the sample. Simultaneous training of the validating and generative models results in a procedure that detects corrupted data nearly as well as the poisoning sample is optimal. The modified CGAN framework is then specialized for support vector machines (SVMs) and general classification models.
AB - Many machine learning algorithms lack a procedure for testing/validating the integrity of training data, so that attacks on training data remain a simple yet effective way to derail those algorithms—they can result in devastating consequences, particularly in security-sensitive systems. The difficulty in designing an effective testing procedure is that the distribution of the training data may change naturally with time. Thus, the question is how to distinguish between a natural change in the distribution and a change due to an adversary's attack. This work offers a partial answer with a modification of the existing conditional generative adversarial networks (CGAN) framework, in which a generative model aims not only to avoid detection by a discriminator but also to design a poisoning sample that will result in the largest prediction error if a classifier is trained on data that includes the sample. Simultaneous training of the validating and generative models results in a procedure that detects corrupted data nearly as well as the poisoning sample is optimal. The modified CGAN framework is then specialized for support vector machines (SVMs) and general classification models.
KW - Adversarial ML
KW - GAN
KW - Machine learning
UR - https://www.scopus.com/pages/publications/105019510922
U2 - 10.1007/s11590-025-02253-x
DO - 10.1007/s11590-025-02253-x
M3 - Article
AN - SCOPUS:105019510922
SN - 1862-4472
JO - Optimization Letters
JF - Optimization Letters
ER -