MULTI-LAYER FEATURE DIVISION TRANSFERABLE ADVERSARIAL ATTACK
Zikang Jin (Nanjing University of Aeronautics and Astronautics); Changchun Yin (Nanjing University of Aeronautics and Astronautics); Piji Li (Nanjing University of Aeronautics and Astronautics); Lu Zhou (Nanjing University of Aeronautics and Astronautics); Liming Fang (Nanjing University of Aeronautics and Astronautics); Xiangmao Chang (Nanjing University of Aeronautics and Astronautics); Zhe Liu (Nanjing University of Aeronautics and Astronautics)
Improving the transferability of adversarial examples in order to attack unknown black-box models has been studied intensively. In particular, feature-level transfer-based attacks, which corrupt the intermediate feature outputs of a source model, have been shown to generate more transferable adversarial examples. However, existing state-of-the-art feature-level attacks corrupt only a single intermediate layer, which severely limits the transferability of the resulting adversarial examples, and they draw only a vague distinction between positive and negative features. By contrast, we propose the Multi-layer Feature Division Attack (MFDA), which first divides features into positive and negative parts and then aggregates feature information across multiple layers to mount the attack. Extensive experimental evaluation demonstrates that MFDA significantly boosts adversarial transferability and quantitatively distinguishes the effects of positive and negative features on transferability. Compared to state-of-the-art feature-level attacks, MFDA increases the average success rate by 2.8% against normally trained models and by 3.0% against adversarially trained models.
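To make the feature-level mechanism described above concrete, the following is a minimal PyTorch sketch of a generic multi-layer, feature-division-style attack. It is not the paper's exact MFDA formulation: the choice of layers (layer2–layer4 of a ResNet-50 source model), the use of aggregated true-class gradients as per-layer feature weights, the sign-based division of features into positive and negative parts, and the function name mfda_style_attack are all illustrative assumptions.

    # Illustrative sketch only; layer choice, weighting scheme, and the
    # sign-based positive/negative feature division are assumptions, not
    # the authors' exact MFDA method.
    import torch
    import torchvision.models as models

    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()
    layers = [model.layer2, model.layer3, model.layer4]  # assumed multi-layer choice

    feats = {}
    def make_hook(idx):
        def hook(_module, _inputs, output):
            feats[idx] = output  # capture the intermediate feature map
        return hook
    for idx, layer in enumerate(layers):
        layer.register_forward_hook(make_hook(idx))

    def feature_weights(x, y):
        # One clean forward/backward pass: the gradient of the true-class
        # logit w.r.t. each layer's features serves as an importance map;
        # its sign divides features into positive (class-supporting) and
        # negative parts.
        x = x.clone().requires_grad_(True)
        logits = model(x)
        score = logits.gather(1, y.unsqueeze(1)).sum()
        grads = torch.autograd.grad(score, [feats[i] for i in range(len(layers))])
        return [g.detach() for g in grads]

    def mfda_style_attack(x, y, eps=8/255, alpha=2/255, steps=10):
        w = feature_weights(x, y)  # fixed per-layer importance weights
        x_adv = x.clone()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            model(x_adv)  # forward pass repopulates feats via the hooks
            # Minimizing the weighted feature sum aggregated over all chosen
            # layers suppresses positive features and amplifies negative ones.
            loss = sum((w[i] * feats[i]).sum() for i in range(len(layers)))
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() - alpha * grad.sign()
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
        return x_adv

A call such as x_adv = mfda_style_attack(images, labels), with images normalized to [0, 1], would yield perturbed inputs within an L-infinity ball of radius eps; the key design point this sketch reflects is that the feature-destruction loss is aggregated over several intermediate layers rather than a single one, with the sign of the aggregated gradient weights acting as the positive/negative division criterion.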