Inverting the Feature Visualization Process for Feedforward Neural Networks
This work examines the invertibility of feature visualization in neural networks. The input generated by feature visualization via activation maximization does not, in general, yield the feature objective it was optimized for, so we instead investigate optimizing for the feature objective that yields this input. Given the objective function used in activation maximization, which measures how closely a given input resembles the feature objective, we exploit the fact that the gradient of this function with respect to the input is, up to a scaling factor, linear in the objective. This observation lets us find the optimal feature objective by computing a closed-form solution that minimizes the gradient. With Inverse Feature Visualization, we intend to provide an alternative view of a network's sensitivity to certain inputs that considers feature objectives rather than activations.
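The closed-form step described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: the objective is taken to be the plain inner product between a feature objective vector v and the layer activations a(x), so its input gradient is J(x)^T v and exactly linear in v; the objective is constrained to unit norm to exclude the trivial solution v = 0; and the two-layer tanh network and the names `layer_jacobian` and `inverse_feature_objective` are hypothetical. The objective function and scaling factor used in the paper may differ.

```python
import numpy as np

def layer_jacobian(W1, b1, W2, x):
    """Jacobian of a(x) = W2 @ tanh(W1 @ x + b1) with respect to x (toy network)."""
    h = np.tanh(W1 @ x + b1)
    # d tanh(z)/dz = 1 - tanh(z)^2, applied row-wise to W1.
    return W2 @ ((1.0 - h**2)[:, None] * W1)  # shape (m, n)

def inverse_feature_objective(J):
    """Unit-norm v minimizing ||J.T @ v||_2 (assumed inner-product objective).

    ||J.T @ v||^2 = v.T @ (J @ J.T) @ v, so the minimizer is the eigenvector
    of J @ J.T that belongs to its smallest eigenvalue.
    """
    eigvals, eigvecs = np.linalg.eigh(J @ J.T)  # eigenvalues in ascending order
    v = eigvecs[:, 0]
    return v, np.linalg.norm(J.T @ v)

# Toy setup: random weights and a random stand-in for a visualized input.
rng = np.random.default_rng(0)
n, hidden, m = 6, 5, 3
W1 = rng.normal(size=(hidden, n))
b1 = rng.normal(size=hidden)
W2 = rng.normal(size=(m, hidden))
x = rng.normal(size=n)

J = layer_jacobian(W1, b1, W2, x)
v_opt, g_opt = inverse_feature_objective(J)
print("recovered objective:", v_opt)
print("gradient norm at x:", g_opt)
```

Under these assumptions, `v_opt` is the direction in feature space for which the given input is closest to a stationary point of the objective. The sign of `v_opt` is arbitrary in this sketch, since v and -v give the same gradient norm.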