MicroExpNet: An Extremely Small and Fast Model For Expression Recognition From Frontal Face Images
This paper aims to create extremely small and fast convolutional neural networks (CNNs) for facial expression recognition (FER) from frontal face images. We show that, for this problem, translation invariance (achieved through max-pooling layers) degrades performance, especially when the network is small, and that knowledge distillation can be used to obtain extremely compressed CNNs. Extensive comparisons on two widely used FER datasets, CK+ and Oulu-CASIA, demonstrate that our largest model sets a new state of the art, yielding 1.8 […] improvement over the previous best results on the CK+ and Oulu-CASIA datasets, respectively. In addition, our smallest model (MicroExpNet), obtained using knowledge distillation, is less than 1 MB in size and runs at 1408 frames per second on an Intel i7 CPU. Although slightly less accurate than our largest model, MicroExpNet still achieves an 8.3 […] improvement on the Oulu-CASIA dataset over the previous state-of-the-art, much larger network; on the CK+ dataset, it performs on par with a previous state-of-the-art network but with 154x fewer parameters.
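For readers unfamiliar with knowledge distillation, the following is a minimal sketch of a commonly used distillation loss, in which a small student network is trained against both the ground-truth label and the temperature-softened outputs of a larger teacher. It is an illustration only and not necessarily the exact formulation used in the paper: the function names, the temperature of 4.0, and the weighting `alpha` of 0.5 are all assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target and a predicted distribution."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a hard-label term and a soft-target term.

    temperature and alpha are hypothetical defaults, not values
    taken from the paper.
    """
    num_classes = len(student_logits)
    one_hot = [1.0 if i == label else 0.0 for i in range(num_classes)]

    # Hard term: student predictions against the ground-truth label.
    hard = cross_entropy(one_hot, softmax(student_logits))

    # Soft term: student against the teacher's softened distribution.
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))

    return alpha * hard + (1.0 - alpha) * soft


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.5, 0.2, -0.3]
    print(distillation_loss(student, teacher, label=0))
```

A higher temperature flattens the teacher's distribution, so the student also receives a signal about how the teacher ranks the incorrect classes; `alpha` trades off that signal against the ground-truth label.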