Attacking the Madry Defense Model with L_1-based Adversarial Examples
The Madry Lab recently hosted a competition designed to test the robustness of their adversarially trained MNIST model. Attacks were constrained to perturb each pixel of the input image by at most ϵ = 0.3 in L_∞ distance (with pixel values scaled to [0, 1]). This discourages the use of attacks that are not optimized on the L_∞ distortion metric. Our experimental results demonstrate that by relaxing the L_∞ constraint of the competition, the elastic-net attack to deep neural networks (EAD) can generate transferable adversarial examples which, despite their high average L_∞ distortion, have minimal visual distortion. These results call into question the use of L_∞ as a sole measure of visual distortion, and further demonstrate the power of EAD at generating robust adversarial examples.
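To make the two distortion regimes concrete, the following is a minimal NumPy sketch (not the authors' code) contrasting the competition's per-pixel L_∞ clipping at ϵ = 0.3 with the elementwise soft-thresholding step that L_1-oriented attacks such as EAD use to keep the L_1 penalty on the perturbation small. The function names (`clip_linf`, `ista_shrink`) and parameter values such as `beta` are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch: L_inf projection vs. L_1-promoting shrinkage.
# This is an assumption-laden toy example, not the EAD implementation.
import numpy as np

def clip_linf(x_adv, x, epsilon=0.3):
    """Project an adversarial image back into the L_inf ball of radius
    epsilon around the clean image x (the competition's constraint)."""
    return np.clip(x_adv, x - epsilon, x + epsilon).clip(0.0, 1.0)

def ista_shrink(z, x, beta=1e-2):
    """Elementwise soft-thresholding: pixel changes smaller than beta are
    zeroed out, larger changes are pulled toward x by beta. This is the
    kind of step that promotes sparse (small-L_1) perturbations."""
    delta = z - x
    shrunk = np.sign(delta) * np.maximum(np.abs(delta) - beta, 0.0)
    return np.clip(x + shrunk, 0.0, 1.0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random((28, 28))                   # stand-in for an MNIST image in [0, 1]
    x_adv = x + rng.normal(0, 0.5, x.shape)    # stand-in for an unconstrained perturbation

    linf_version = clip_linf(x_adv, x)
    l1_version = ista_shrink(x_adv, x)

    print("max |delta| after L_inf clipping:", np.abs(linf_version - x).max())
    print("sum |delta| after shrinkage:", np.abs(l1_version - x).sum())
```

The contrast is the point of the abstract: an example satisfying the L_∞ ball constraint bounds every pixel's change, while an L_1-regularized example may change a few pixels by a large amount (high L_∞) yet remain visually close to the original.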