TY - JOUR
T1 - Black-box attacks on image classification model with advantage actor-critic algorithm in latent space
AU - Kang, Xu
AU - Song, Bin
AU - Guo, Jie
AU - Qin, Hao
AU - Du, Xiaojiang
AU - Guizani, Mohsen
N1 - Publisher Copyright:
© 2023
PY - 2023/5
Y1 - 2023/5
N2 - The Internet of Things (IoT) ecosystem, which integrates a wide variety of intelligent multimedia applications and services, has undergone a tremendous transformation over the years. As an essential approach for securing IoT-based multimedia services, Artificial Intelligence (AI) has been advancing at a rapid pace. However, many machine learning systems, including advanced deep neural networks, are vulnerable to adversarial examples: imperceptible modifications to real examples can drive a model's predictions far from the correct values. This research introduces a deep reinforcement learning-based black-box attacker on image classification models. Unlike existing black-box attacks, which require massive queries and trials in the pixel space, the proposed method compresses images into a latent space through variational inference and queries optimal examples efficiently with actor-critic networks. Rather than performing patch-to-patch translation with generative adversarial networks as in related works, fake examples are generated by gradually superimposing perturbations onto the latent space at each step of a Markov decision process (MDP), giving the attacker high stability and good convergence. Experiments on the ImageNet dataset demonstrate that the proposed attacker can generate adversarial images for most samples within a limited number of steps, greatly reducing the accuracy of the model.
AB - The Internet of Things (IoT) ecosystem, which integrates a wide variety of intelligent multimedia applications and services, has undergone a tremendous transformation over the years. As an essential approach for securing IoT-based multimedia services, Artificial Intelligence (AI) has been advancing at a rapid pace. However, many machine learning systems, including advanced deep neural networks, are vulnerable to adversarial examples: imperceptible modifications to real examples can drive a model's predictions far from the correct values. This research introduces a deep reinforcement learning-based black-box attacker on image classification models. Unlike existing black-box attacks, which require massive queries and trials in the pixel space, the proposed method compresses images into a latent space through variational inference and queries optimal examples efficiently with actor-critic networks. Rather than performing patch-to-patch translation with generative adversarial networks as in related works, fake examples are generated by gradually superimposing perturbations onto the latent space at each step of a Markov decision process (MDP), giving the attacker high stability and good convergence. Experiments on the ImageNet dataset demonstrate that the proposed attacker can generate adversarial images for most samples within a limited number of steps, greatly reducing the accuracy of the model.
KW - Actor-critic
KW - Adversarial examples
KW - Black-box attacks
KW - Deep reinforcement learning
KW - Policy gradient
UR - http://www.scopus.com/inward/record.url?scp=85145976913&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145976913&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2023.01.019
DO - 10.1016/j.ins.2023.01.019
M3 - Article
AN - SCOPUS:85145976913
SN - 0020-0255
VL - 624
SP - 624
EP - 638
JO - Information Sciences
JF - Information Sciences
ER -