Towards Adversarial Robustness Via Compact Feature Representations
Muhammad Shah, Raphael Olivier, Bhiksha Raj
SPS
IEEE Members: $11.00
Non-members: $15.00
Length: 00:15:16
Deep Neural Networks (DNNs), while providing state-of-the-art performance on a wide variety of tasks, have been shown to be vulnerable to adversarial attacks. A popular hypothesis is that DNNs are vulnerable because they operate over a grossly overspecified input space with very sparse human supervision. As a result, DNNs tend to rely on spurious input features that humans ignore, which makes these features a likely attack vector for an adversary. It is reasonable to expect that reducing the size of the feature representation in a way that does not harm generalization would discard spurious features before perceptually relevant ones. To explore this hypothesis, we take non-robust pretrained models, shrink their feature representations in various ways using existing and novel techniques, and then evaluate the robustness of the resulting models against an array of popular adversarial attack methods. We find that models do become more robust to adversarial attacks once the size of the feature representation has been reduced. Beyond being more robust, models with compact feature representations are also more resource efficient.
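The pipeline described in the abstract (shrink a model's feature representation, then attack it) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy two-layer network with random weights, the use of PCA as the compression step, and the one-step FGSM attack are stand-ins, not the techniques or models evaluated in the talk, and the names `pca_basis` and `fgsm` are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pca_basis(features, k):
    """Return (mean, basis) spanning the top-k principal directions of `features`."""
    mean = features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:k].T  # basis has shape (d_feat, k), orthonormal columns

def fgsm(x, y, W1, mean, basis, W2, b, eps):
    """One-step FGSM on: x -> relu(x @ W1) -> PCA projection -> linear head."""
    a = x @ W1
    h = np.maximum(a, 0.0)
    z = (h - mean) @ basis
    grad_logits = softmax(z @ W2 + b)
    grad_logits[np.arange(len(y)), y] -= 1.0   # d(cross-entropy)/d(logits)
    grad_h = grad_logits @ W2.T @ basis.T      # back through head and projection
    grad_x = (grad_h * (a > 0)) @ W1.T         # back through ReLU and first layer
    return x + eps * np.sign(grad_x)           # L-infinity step of size eps

# Toy data and random weights (assumptions; no trained model is involved).
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 10))
y = rng.integers(0, 3, size=32)
W1 = rng.normal(size=(10, 16))

features = np.maximum(x @ W1, 0.0)             # original 16-dim representation
mean, basis = pca_basis(features, k=4)         # compact 4-dim representation
compact = (features - mean) @ basis

W2 = rng.normal(size=(4, 3))                   # head operating on compact features
b = np.zeros(3)
eps = 0.1
x_adv = fgsm(x, y, W1, mean, basis, W2, b, eps)
```

In an actual evaluation, one would compare accuracy on `x` and `x_adv` for the original and the compressed model; this sketch only shows how the compression and attack steps fit together and makes no claim about the resulting robustness.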
Chairs:
George Atia