BlendNet: Design and Optimization of a Neural Network-Based Inference Engine Blending Binary and Fixed-Point Convolutions

07/07/2023
by   Arash Fayyazi, et al.
0

This paper presents BlendNet, a neural network architecture employing a novel building block called Blend module, which relies on performing binary and fixed-point convolutions in its main and skip paths, respectively. There is a judicious deployment of batch normalizations on both main and skip paths inside the Blend module and in between consecutive Blend modules. This paper also presents a compiler for mapping various BlendNet models obtained by replacing some blocks/modules in various vision neural network models with BlendNet modules to FPGA devices with the goal of minimizing the end-to-end inference latency while achieving high output accuracy. BlendNet-20, derived from ResNet-20 trained on the CIFAR-10 dataset, achieves 88.0 accuracy (0.8 only takes 0.38ms to process each image (1.4x faster than state-of-the-art). Similarly, our BlendMixer model trained on the CIFAR-10 dataset achieves 90.6 accuracy (1.59 reduction in the model size. Moreover, The reconfigurability of DSP blocks for performing 48-bit bitwise logic operations is utilized to achieve low-power FPGA implementation. Our measurements show that the proposed implementation yields 2.5x lower power consumption.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset