Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Accuracy
Adversarial robustness has become a central goal in deep learning, both in theory and in practice. However, successful methods for improving adversarial robustness (such as adversarial training) greatly hurt generalization performance on clean data. This could have a major impact on how adversarial robustness affects real-world systems (e.g., many practitioners may opt to forego robustness if it hurts performance on clean data). We propose Interpolated Adversarial Training, which employs recently proposed interpolation-based training methods within the framework of adversarial training. On CIFAR-10, adversarial training increases clean test error from 5.8% to 16.7%, whereas Interpolated Adversarial Training retains adversarial robustness while achieving a clean test error of only 6.5%. With our technique, the relative error increase for the robust model is reduced from 187.9% to 12.1%.
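The core idea is to combine adversarial training with interpolation-based training (in the style of mixup): craft adversarial examples, then train on convex combinations of them rather than on the raw adversarial inputs. Below is a minimal sketch of one such training step, assuming a toy linear classifier and a single-step FGSM attack; the paper's actual setup uses deep networks and stronger iterative attacks, and all function names here (`fgsm`, `mixup`, `train_step`) are our illustrative labels, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def grad_loss(w, x, y):
    # Gradients of binary cross-entropy for a linear model, with respect
    # to the weights (for the update) and the input (for the attack).
    p = sigmoid(x @ w)
    return (p - y) * x, (p - y) * w  # (dL/dw, dL/dx)


def fgsm(w, x, y, eps=0.1):
    # Single-step FGSM: perturb x along the sign of the input gradient.
    _, gx = grad_loss(w, x, y)
    return x + eps * np.sign(gx)


def mixup(x1, y1, x2, y2, alpha=1.0):
    # Mixup-style interpolation: convex combination of inputs and labels.
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1.0 - lam) * x2, lam * y1 + (1.0 - lam) * y2


def train_step(w, x1, y1, x2, y2, lr=0.5):
    # Interpolated adversarial training step (sketch): build adversarial
    # examples for a pair of inputs, interpolate them, and take a
    # gradient step on the interpolated example and soft label.
    xa1, xa2 = fgsm(w, x1, y1), fgsm(w, x2, y2)
    xm, ym = mixup(xa1, y1, xa2, y2)
    gw, _ = grad_loss(w, xm, ym)
    return w - lr * gw


# One illustrative step on random data.
w = rng.normal(size=3)
x1, x2 = rng.normal(size=3), rng.normal(size=3)
w = train_step(w, x1, 0.0, x2, 1.0)
```

Note that the soft label `ym` produced by `mixup` drops straight into the cross-entropy gradient `(p - y) * x`, which is valid for non-binary targets; this is the same property mixup relies on in the multi-class setting.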