Reject Illegal Inputs with Generative Classifier Derived from Any Discriminative Classifier

01/02/2020
by Xin Wang, et al.

Generative classifiers have shown promise in detecting illegal inputs, including adversarial examples and out-of-distribution samples. Supervised Deep Infomax (SDIM) is a scalable end-to-end framework for learning generative classifiers. In this paper, we propose a modification of SDIM termed SDIM-logit. Instead of training a generative classifier from scratch, SDIM-logit first takes as input the logits produced by any given discriminative classifier and generates logit representations; a generative classifier is then derived by imposing statistical constraints on these logit representations. SDIM-logit can inherit the performance of the discriminative classifier without loss. It adds only a negligible number of parameters and can be trained efficiently with the base classifier fixed. We perform classification with rejection, where test samples whose class conditionals are smaller than pre-chosen thresholds are rejected without a prediction. Experiments on illegal inputs, including adversarial examples, samples with common corruptions, and out-of-distribution (OOD) samples, show that, when allowed to reject a portion of test samples, SDIM-logit significantly improves performance on the remaining test samples.
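The rejection rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each sample comes with one class-conditional log-likelihood per class, that the predicted class is the one with the largest value, and that each class has its own threshold. The function names and the numbers in the usage example are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def classify_with_rejection(
    log_cond: Sequence[float], thresholds: Sequence[float]
) -> Optional[int]:
    """Return the predicted class index, or None if the sample is rejected.

    log_cond[c] is the (assumed) class-conditional log-likelihood of the
    sample under class c; thresholds[c] is the pre-chosen threshold for c.
    """
    if len(log_cond) != len(thresholds):
        raise ValueError("need exactly one threshold per class")
    # Predict the class with the largest class conditional.
    pred = max(range(len(log_cond)), key=lambda c: log_cond[c])
    # Reject when that class conditional is smaller than its threshold.
    if log_cond[pred] < thresholds[pred]:
        return None
    return pred


def evaluate_with_rejection(
    batch: Sequence[Sequence[float]],
    labels: Sequence[int],
    thresholds: Sequence[float],
) -> Tuple[float, float]:
    """Return (accuracy on accepted samples, rejection rate)."""
    preds: List[Optional[int]] = [
        classify_with_rejection(x, thresholds) for x in batch
    ]
    accepted = [(p, y) for p, y in zip(preds, labels) if p is not None]
    rejection_rate = 1.0 - len(accepted) / len(batch)
    if not accepted:
        return 0.0, rejection_rate
    accuracy = sum(p == y for p, y in accepted) / len(accepted)
    return accuracy, rejection_rate


if __name__ == "__main__":
    # Hypothetical log-likelihoods for three samples over three classes.
    batch = [[-5.0, -1.0, -9.0], [-5.0, -4.0, -9.0], [-0.5, -7.0, -8.0]]
    labels = [1, 1, 2]
    thresholds = [-3.0, -3.0, -3.0]
    print(evaluate_with_rejection(batch, labels, thresholds))
```

Here the second sample is rejected because its best class conditional (-4.0) falls below the threshold (-3.0), so accuracy is computed only over the two accepted samples.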
