Extended Bit-Plane Compression for Convolutional Neural Network Accelerators

10/01/2018
by Lukas Cavigelli, et al.

After the tremendous success of convolutional neural networks in image classification, object detection, speech recognition, etc., there is now rising demand for deployment of these compute-intensive ML models on tightly power-constrained embedded and mobile systems at low cost, as well as for pushing the throughput in data centers. This has triggered a wave of research towards specialized hardware accelerators. Their performance is often constrained by I/O bandwidth, and their energy consumption is dominated by I/O transfers to off-chip memory. We introduce and evaluate a novel, hardware-friendly compression scheme for the feature maps present within convolutional neural networks. We show that an average compression ratio of 4.4x relative to uncompressed data and a gain of 60% over existing methods can be achieved for ResNet-34 with a compression block requiring <300 bit of sequential cells and minimal combinational logic.
