TY - JOUR
T1 - Improving the transferability of adversarial attacks via self-ensemble
AU - Cheng, Shuyan
AU - Li, Peng
AU - Liu, Jianguo
AU - Xu, He
AU - Yao, Yudong
N1 - Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
PY - 2024/11
Y1 - 2024/11
N2 - Deep neural networks have been used extensively for diverse visual tasks, including object detection, face recognition, and image classification. However, they face several security threats, such as adversarial attacks. To improve the resistance of neural networks to adversarial attacks, researchers have investigated the security of models from the perspectives of both attacks and defenses. Recently, the transferability of adversarial attacks has received extensive attention, as it promotes the application of adversarial attacks in practical scenarios. However, existing transferable attacks tend to fall into poor local optima and suffer significantly degraded transferability because the production of adversarial samples lacks randomness. Therefore, we propose a self-ensemble-based feature-level adversarial attack (SEFA) that boosts transferability by randomly disrupting salient features. We provide a theoretical analysis to demonstrate the superiority of the proposed method. In particular, perturbing the intermediate features weighted by refined feature importance suppresses positive features and encourages negative features to realize adversarial attacks. Subsequently, self-ensemble is introduced to solve the optimization problem, enhancing diversity from an optimization perspective. Diverse orthogonal initial perturbations disrupt these features stochastically, searching the space of transferable perturbations exhaustively to avoid poor local optima and effectively improve transferability. Extensive experiments show the effectiveness and superiority of the proposed SEFA: the success rates against undefended models and defense models are improved by 7.7% and 13.4%, respectively, compared with existing transferable attacks. Our code is available at https://github.com/chengshuyan/SEFA.
KW - Adversarial examples
KW - Black-box attacks
KW - Feature importance
KW - Self-ensemble
KW - Transferability
UR - http://www.scopus.com/inward/record.url?scp=85201815972&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85201815972&partnerID=8YFLogxK
U2 - 10.1007/s10489-024-05728-z
DO - 10.1007/s10489-024-05728-z
M3 - Article
AN - SCOPUS:85201815972
SN - 0924-669X
VL - 54
SP - 10608
EP - 10626
JO - Applied Intelligence
JF - Applied Intelligence
IS - 21
ER -