Toward Finding The Global Optimal of Adversarial Examples

09/10/2019
by Zhenxin Xiao, et al.

Current machine learning models are vulnerable to adversarial examples (Goodfellow et al., 2014). We noticed that current state-of-the-art methods (Kurakin et al., 2016; Cheng et al., 2018) for attacking a well-trained model often get stuck in local optima. We conduct a series of experiments in both white-box and black-box settings and find that, under different initializations, the attack algorithm converges to very different local optima, which suggests the importance of a careful and thorough search of the attack space. In this paper, we propose a general boosting algorithm that helps current attacks find a more globally optimal adversarial example. Specifically, we search for adversarial examples starting from different points/directions; at certain intervals we adopt successive halving (Jamieson & Talwalkar, 2016) to cut down the search directions that are not promising, and use Bayesian Optimization (Pelikan et al., 1999; Bergstra et al., 2011) to resample from the search space based on the knowledge obtained from past searches. We demonstrate that by applying our methods to state-of-the-art attack algorithms in both black-box and white-box settings, we can further reduce the distortion between the original image and the adversarial sample by about 10 [...] computation cost 5-10 times without harming the final result. We conduct experiments on models trained on MNIST or ImageNet and also on decision tree models; these experiments suggest that our method is a general way to boost the performance of current adversarial attack methods.
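To make the multi-start and successive-halving idea concrete, here is a minimal Python sketch on a toy one-dimensional problem. It is not the authors' implementation: the objective `toy_distortion`, the gradient-descent `local_search`, and all parameter values are hypothetical stand-ins for a real attack and its distortion measure, and the Bayesian Optimization resampling step mentioned in the abstract is left out entirely.

```python
import random


def toy_distortion(x):
    # Hypothetical stand-in for "distortion": a 1-D function with two
    # local minima (near x = 0.96 and x = -1.04); the left one is lower.
    return (x * x - 1.0) ** 2 + 0.3 * x


def toy_gradient(x):
    # Analytic derivative of toy_distortion.
    return 4.0 * x * (x * x - 1.0) + 0.3


def local_search(x, steps, lr):
    # Plain gradient descent, standing in for one attack run that can
    # only reach the local minimum of the basin it starts in.
    for _ in range(steps):
        x -= lr * toy_gradient(x)
    return x


def successive_halving_search(num_starts=16, steps_per_round=20, lr=0.02, seed=0):
    # Start from several random points, refine all of them for a short
    # budget, then keep only the better half and repeat.
    rng = random.Random(seed)
    candidates = [rng.uniform(-2.0, 2.0) for _ in range(num_starts)]
    while True:
        candidates = [local_search(x, steps_per_round, lr) for x in candidates]
        if len(candidates) == 1:
            break
        candidates.sort(key=toy_distortion)
        candidates = candidates[: max(1, len(candidates) // 2)]
    best = candidates[0]
    return best, toy_distortion(best)


if __name__ == "__main__":
    best_x, best_value = successive_halving_search()
    print(best_x, best_value)
```

A single run started in the right-hand basin would stay at the worse local minimum, whereas the halving schedule spends most of its budget on the starts that look promising early. In this sketch the pruning happens at fixed intervals of `steps_per_round` steps; how the intervals are chosen in the paper is not stated in the abstract.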
