ERIC: Extracting Relations Inferred from Convolutions
Our main contribution is to show that the behaviour of kernels across multiple layers of a convolutional neural network (CNN) can be approximated using a logic program. The extracted logic programs yield accuracies that correlate with those of the original model, though with some information loss, in particular as approximations of multiple layers are chained together or as lower layers are quantised. We also show that an extracted program can be used as a framework for further understanding the behaviour of CNNs. Specifically, it can be used to identify key kernels worthy of deeper inspection and to identify their relationships with other kernels in the form of logical rules. Finally, we make a preliminary, qualitative assessment of the rules we extract from the last convolutional layer and show that the kernels identified are symbolic, in that they react strongly to sets of similar images that effectively divide output classes into sub-classes with distinct characteristics.
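To make the idea of approximating kernels with a logic program more concrete, the following is a minimal sketch under stated assumptions, not the procedure from the paper. It assumes each kernel is quantised to a boolean atom by comparing the L1 norm of its activation map to a per-kernel threshold, and that a rule is a conjunction of positive and negated atoms. The names `quantise_kernels` and `rule_fires`, the choice of norm, and the toy thresholds are all hypothetical.

```python
import numpy as np


def quantise_kernels(feature_maps, thresholds):
    """Map each kernel's activation map to a boolean atom.

    feature_maps: array of shape (num_kernels, H, W)
    thresholds:   array of shape (num_kernels,)
    Assumption: a kernel counts as "active" when the L1 norm of its
    activation map exceeds its threshold.
    """
    norms = np.abs(feature_maps).sum(axis=(1, 2))
    return norms > thresholds


def rule_fires(atoms, positive, negative):
    """Evaluate a conjunctive rule body over quantised kernel atoms.

    The rule fires when every kernel in `positive` is active and
    no kernel in `negative` is active.
    """
    return bool(
        all(atoms[i] for i in positive)
        and not any(atoms[j] for j in negative)
    )


# Toy activations for three kernels on one image (made-up values).
feature_maps = np.array([
    np.full((2, 2), 1.0),  # L1 norm 4.0
    np.full((2, 2), 0.1),  # L1 norm 0.4
    np.full((2, 2), 0.5),  # L1 norm 2.0
])
thresholds = np.array([2.0, 2.0, 1.0])

atoms = quantise_kernels(feature_maps, thresholds)

# Hypothetical rule: class_a :- kernel_0, kernel_2, not kernel_1.
fires = rule_fires(atoms, positive=[0, 2], negative=[1])
```

In a sketch like this, chaining layers would mean feeding the atoms derived for one layer into the rules for the next, which is one place where the information loss mentioned above could accumulate.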