Reject Illegal Inputs with Generative Classifier Derived from Any Discriminative Classifier

01/02/2020

Generative classifiers have shown promise in detecting illegal inputs, including adversarial examples and out-of-distribution samples. Supervised Deep Infomax (SDIM) is a scalable end-to-end framework for learning generative classifiers. In this paper, we propose a modification of SDIM termed SDIM-logit. Instead of training a generative classifier from scratch, SDIM-logit first takes as input the logits produced by any given discriminative classifier and generates logit representations; a generative classifier is then derived by imposing statistical constraints on these logit representations. SDIM-logit can inherit the performance of the discriminative classifier without loss. It adds a negligible number of parameters and can be trained efficiently with the base classifier fixed. We perform classification with rejection, where test samples whose class conditionals are smaller than pre-chosen thresholds are rejected without predictions. Experiments on illegal inputs, including adversarial examples, samples with common corruptions, and out-of-distribution (OOD) samples, show that, when allowed to reject a portion of test samples, SDIM-logit significantly improves performance on the remaining test samples.
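The rejection rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the generative classifier already outputs one log class-conditional score per class for each sample, that the predicted class is the one with the highest score, and that one threshold per class has been chosen in advance (how the thresholds are chosen is not specified here). The function name `classify_with_rejection`, the sentinel `REJECT`, and the toy numbers are all hypothetical.

```python
import numpy as np

REJECT = -1  # hypothetical sentinel label marking a rejected sample


def classify_with_rejection(log_cond, thresholds):
    """Predict a class per sample, or REJECT when the score is too low.

    log_cond:   array of shape (n_samples, n_classes); assumed to hold the
                log class-conditional score of each sample under each class.
    thresholds: array of shape (n_classes,); assumed pre-chosen, one per class.
    """
    log_cond = np.asarray(log_cond, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)

    # Candidate prediction: the class with the largest class-conditional score.
    pred = log_cond.argmax(axis=1)
    best = log_cond[np.arange(len(pred)), pred]

    # Accept only if the score reaches the threshold of the predicted class.
    accepted = best >= thresholds[pred]
    return np.where(accepted, pred, REJECT)


# Toy scores for three samples and three classes (made-up numbers).
scores = np.array([[-1.0, -5.0, -9.0],
                   [-8.0, -7.5, -9.0],
                   [-6.0, -2.0, -3.0]])
thresholds = np.array([-4.0, -4.0, -4.0])
labels = classify_with_rejection(scores, thresholds)
```

In this toy run the second sample is rejected because its best score falls below the threshold of its predicted class, while the other two samples receive ordinary predictions. Accuracy would then be measured only on the accepted samples.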
