Backdoor Attack with Sample-Specific Triggers

12/07/2020
by Yuezun Li, et al.

Recently, backdoor attacks have posed a new security threat to the training process of deep neural networks (DNNs). Attackers intend to inject a hidden backdoor into DNNs so that the attacked model performs well on benign samples, whereas its predictions are maliciously changed if the hidden backdoor is activated by an attacker-defined trigger. Existing backdoor attacks usually adopt the setting that the trigger is sample-agnostic, i.e., different poisoned samples contain the same trigger, which means the attacks can be easily mitigated by current backdoor defenses. In this work, we explore a novel attack paradigm in which the backdoor trigger is sample-specific. Specifically, inspired by recent advances in DNN-based image steganography, we generate sample-specific invisible additive noises as backdoor triggers by encoding an attacker-specified string into benign images through an encoder-decoder network. The mapping from the string to the target label is learned when DNNs are trained on the poisoned dataset. Extensive experiments on benchmark datasets verify the effectiveness of our method in attacking models with or without defenses.
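To make the poisoning step concrete, the sketch below illustrates the general idea of generating sample-specific triggers with an encoder network and relabeling the perturbed samples to a target class. The `StegoEncoder` architecture, the 0.05 residual scale, the `poison_dataset` helper, and the 10% poisoning rate are all illustrative assumptions, not the paper's actual steganography network or training recipe.

```python
import torch
import torch.nn as nn

# Hypothetical encoder: takes a benign image and a fixed bit-string and
# returns a visually similar image carrying a sample-specific additive
# perturbation. In the paper this role is played by a DNN-based image
# steganography encoder-decoder; this architecture is only a placeholder.
class StegoEncoder(nn.Module):
    def __init__(self, msg_len: int = 32):
        super().__init__()
        self.msg_proj = nn.Linear(msg_len, 64)
        self.body = nn.Sequential(
            nn.Conv2d(3 + 64, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image: torch.Tensor, message: torch.Tensor) -> torch.Tensor:
        # Broadcast the encoded message over the spatial dimensions, then
        # predict a small residual that is added to the benign image.
        b, _, h, w = image.shape
        m = self.msg_proj(message).view(b, -1, 1, 1).expand(b, 64, h, w)
        residual = self.body(torch.cat([image, m], dim=1))
        return (image + 0.05 * residual).clamp(0, 1)


def poison_dataset(images, labels, encoder, message, target_label, rate=0.1):
    """Replace a fraction `rate` of samples with encoder-perturbed copies
    relabeled to `target_label` (assumed poisoning procedure)."""
    n_poison = int(rate * len(images))
    idx = torch.randperm(len(images))[:n_poison]
    msg = message.unsqueeze(0).expand(n_poison, -1)
    with torch.no_grad():
        images[idx] = encoder(images[idx], msg)
    labels[idx] = target_label
    return images, labels
```

Training a classifier on the resulting dataset would, under this setup, associate the string-induced perturbation pattern with the target label, while each poisoned image carries a different (sample-specific) trigger.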
