Embedding Deep Networks into Visual Explanations

09/15/2017
by Zhongang Qi, et al.

In this paper, we propose a novel explanation module to explain the predictions made by deep learning. The explanation module works by embedding a high-dimensional deep network layer nonlinearly into a low-dimensional explanation space, while retaining faithfulness in that the original deep learning predictions can be reconstructed from the few concepts extracted by the explanation module. We then visualize these concepts so that humans can learn about the high-level concepts deep learning uses to make decisions. We propose an algorithm called Sparse Reconstruction Autoencoder (SRAE) for learning the embedding into the explanation space; SRAE aims to reconstruct part of the original feature space while retaining faithfulness. A visualization system is then introduced for human understanding of the features in the explanation space. The proposed method is applied to explain CNN models in image classification tasks, and several novel metrics are introduced to evaluate the quality of explanations quantitatively without human involvement. Experiments show that the proposed approach generates better explanations of the mechanisms CNNs use to make their predictions in these tasks.
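To make the idea concrete, here is a minimal PyTorch sketch of an SRAE-style explanation module as described above: a nonlinear encoder maps a high-dimensional CNN feature layer into a few explanation dimensions, a decoder reconstructs part of the feature space, and a linear head must recover the network's prediction from those few concepts. The layer sizes, loss weights, and exact sparsity/faithfulness terms are illustrative assumptions, not the authors' precise formulation.

```python
import torch
import torch.nn as nn


class SRAESketch(nn.Module):
    """Hypothetical sketch of a sparse reconstruction autoencoder explainer."""

    def __init__(self, feat_dim=4096, expl_dim=8, num_classes=2):
        super().__init__()
        # Nonlinear embedding of the deep feature layer into a small explanation space.
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, expl_dim),
        )
        # Decoder that tries to reconstruct (part of) the original feature layer.
        self.decoder = nn.Sequential(
            nn.Linear(expl_dim, 512), nn.ReLU(),
            nn.Linear(512, feat_dim),
        )
        # Linear head that must rebuild the network's prediction from the few
        # explanation concepts (the "faithfulness" requirement).
        self.predictor = nn.Linear(expl_dim, num_classes)

    def forward(self, feats):
        z = self.encoder(feats)      # concepts in the explanation space
        recon = self.decoder(z)      # partial feature reconstruction
        logits = self.predictor(z)   # prediction reconstructed from concepts
        return z, recon, logits


def srae_loss(z, recon, logits, feats, orig_logits,
              w_recon=1.0, w_faith=1.0, w_sparse=0.01):
    """Illustrative training objective; the weights are assumed values."""
    # Reconstruction term: the explanation space should retain feature content.
    recon_loss = nn.functional.mse_loss(recon, feats)
    # Faithfulness term: concepts should reproduce the original prediction.
    faith_loss = nn.functional.mse_loss(logits, orig_logits)
    # Sparsity term: encourage each example to rely on only a few concepts.
    sparse_loss = z.abs().mean()
    return w_recon * recon_loss + w_faith * faith_loss + w_sparse * sparse_loss
```

In this sketch, `feats` would be activations taken from a late CNN layer and `orig_logits` the network's original outputs; the low-dimensional `z` is what the paper's visualization system would then present to a human.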
