TY - GEN
T1 - Analyzing and Defending against Membership Inference Attacks in Natural Language Processing Classification
AU - Wang, Yijue
AU - Xu, Nuo
AU - Huang, Shaoyi
AU - Mahmood, Kaleel
AU - Guo, Dan
AU - Ding, Caiwen
AU - Wen, Wujie
AU - Rajasekaran, Sanguthevar
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - The risk posed by Membership Inference Attack (MIA) to deep learning models for Computer Vision (CV) tasks is well known, but MIA has not been fully addressed or explored in the Natural Language Processing (NLP) domain. In this work, we analyze the security risk posed by MIA to NLP models. We show that NLP models are at great risk from MIA, in some cases even more so than models trained on CV datasets; on average, the attack success rate against NLP models is 8.04% higher than against CV models and datasets. We determine that unique characteristics of NLP classification tasks, in terms of model overfitting, model complexity, and data diversity, make the privacy leakage severe and very different from that of CV classification tasks. Based on these findings, we propose a novel defense algorithm, Gap score Regularization Integrated Pruning (GRIP), which protects NLP models against MIA while achieving competitive testing accuracy. Our experimental results show that GRIP can decrease the MIA success rate by as much as 31.25% compared to the undefended model. In addition, compared to differential privacy, GRIP offers 7.81% more robustness to MIA and 13.24% higher testing accuracy. Overall, our experiments span four NLP and two CV datasets and cover a total of five different model architectures.
UR - http://www.scopus.com/inward/record.url?scp=85147897084&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147897084&partnerID=8YFLogxK
U2 - 10.1109/BigData55660.2022.10020711
DO - 10.1109/BigData55660.2022.10020711
M3 - Conference contribution
AN - SCOPUS:85147897084
T3 - Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
SP - 5823
EP - 5832
BT - Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
A2 - Tsumoto, Shusaku
A2 - Ohsawa, Yukio
A2 - Chen, Lei
A2 - Van den Poel, Dirk
A2 - Hu, Xiaohua
A2 - Motomura, Yoichi
A2 - Takagi, Takuya
A2 - Wu, Lingfei
A2 - Xie, Ying
A2 - Abe, Akihiro
A2 - Raghavan, Vijay
T2 - 2022 IEEE International Conference on Big Data, Big Data 2022
Y2 - 17 December 2022 through 20 December 2022
ER -