Stacked Filters Stationary Flow For Hardware-Oriented Acceleration Of Deep Convolutional Neural Networks
To address memory and computation resource limitations in hardware-oriented acceleration of deep convolutional neural networks (CNNs), this paper presents a computation flow, stacked filters stationary flow (SFS); a corresponding data encoding format, relative indexed compressed sparse filter format (CSF); and a three-dimensional Single Instruction Multiple Data (3D-SIMD) processor architecture that takes full advantage of these two features. Compared with the state-of-the-art result (Han et al., 2016b), our method achieves a 1.11x improvement in reducing the storage required by AlexNet and a 1.09x improvement in reducing the storage required by SqueezeNet, without loss of accuracy on the ImageNet dataset. Moreover, this approach saves chip area that would otherwise be spent on logic handling irregular sparse data accesses.
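To illustrate the general idea behind relative-indexed compressed sparse encodings, the sketch below stores each nonzero weight together with the gap to the previous nonzero rather than an absolute position, splitting over-long gaps with zero-valued padding entries. This is a minimal illustration of the relative-index technique (as popularized by Deep Compression's CSR variant); the function names, the `max_jump` parameter, and the exact layout are assumptions, not the paper's CSF format.

```python
def encode_relative(weights, max_jump=15):
    """Encode a flat weight list as (relative_index, value) pairs.

    Each pair stores the number of zeros since the previous nonzero,
    so indices fit in few bits (here 4 bits if max_jump=15). Gaps
    larger than max_jump are split by inserting padding entries with
    a zero value.
    """
    pairs = []
    last = -1  # position of the previous stored entry
    for i, w in enumerate(weights):
        if w == 0:
            continue
        gap = i - last - 1
        while gap > max_jump:          # split an over-long gap
            pairs.append((max_jump, 0))  # padding entry, value 0
            last += max_jump + 1
            gap = i - last - 1
        pairs.append((gap, w))
        last = i
    return pairs


def decode_relative(pairs):
    """Invert encode_relative back to a dense list.

    Trailing zeros after the last nonzero are not recovered; a real
    format would also store the total length.
    """
    out = []
    for gap, w in pairs:
        out.extend([0] * gap)
        out.append(w)
    return out
```

For example, `encode_relative([0, 0, 3, 0, 5])` yields `[(2, 3), (1, 5)]`, trading absolute 32-bit indices for small relative offsets, which is where the storage reduction of such formats comes from.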