Algorithms and Hardware for Efficient Processing of Logic-based Neural Networks
Recent efforts to improve the performance of neural network (NN) accelerators that meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed-function combinational logic (FFCL). This paper presents an innovative optimization methodology for compiling and mapping NNs that use FFCL onto a logic processor. The presented method maps FFCL blocks to a set of Boolean functions, and the Boolean operations in each function are mapped to high-performance, low-latency, parallelized processing elements. Graph partitioning and scheduling algorithms are presented to handle FFCL blocks that do not fit the logic processor directly. Our experimental evaluations across several datasets and NNs demonstrate the superior inference throughput of our framework compared to prior-art NN accelerators. For the VGG16 model, we achieve 25x higher throughput than the XNOR-based accelerator, a gain that can be amplified 5x by deploying the graph partitioning and merging algorithms.
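The mapping idea can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a Boolean function given as a netlist of two-input gates, assigns each gate to the earliest level at which its operands are ready, and then splits each level into cycles of at most `num_pes` gates, standing in for a logic processor with a fixed number of processing elements. The gate set, the `FULL_ADDER` netlist, and all function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical two-input gate set; the actual operations supported by the
# paper's processing elements are not specified in the abstract.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

# Illustrative netlist (a full adder): gate name -> (operation, operands).
FULL_ADDER = {
    "t1": ("XOR", ("a", "b")),
    "t2": ("AND", ("a", "b")),
    "s": ("XOR", ("t1", "cin")),
    "t3": ("AND", ("t1", "cin")),
    "cout": ("OR", ("t2", "t3")),
}


def levelize(netlist, primary_inputs):
    """Give each gate the earliest level at which all its operands are ready."""
    level = {name: 0 for name in primary_inputs}

    def visit(name):
        if name not in level:
            _, operands = netlist[name]
            level[name] = 1 + max(visit(o) for o in operands)
        return level[name]

    for gate in netlist:
        visit(gate)
    return level


def schedule(netlist, primary_inputs, num_pes):
    """Split each level into cycles holding at most num_pes gates."""
    level = levelize(netlist, primary_inputs)
    by_level = defaultdict(list)
    for gate in netlist:
        by_level[level[gate]].append(gate)
    cycles = []
    for lvl in sorted(by_level):
        gates = sorted(by_level[lvl])
        for i in range(0, len(gates), num_pes):
            cycles.append(gates[i:i + num_pes])
    return cycles


def evaluate(netlist, cycles, inputs):
    """Execute the schedule cycle by cycle; gates within a cycle are independent."""
    values = dict(inputs)
    for cycle in cycles:
        results = {
            g: OPS[netlist[g][0]](*(values[o] for o in netlist[g][1]))
            for g in cycle
        }
        values.update(results)
    return values


if __name__ == "__main__":
    plan = schedule(FULL_ADDER, ["a", "b", "cin"], num_pes=2)
    print(plan)
    for a, b, cin in product((0, 1), repeat=3):
        out = evaluate(FULL_ADDER, plan, {"a": a, "b": b, "cin": cin})
        print(a, b, cin, "->", out["s"], out["cout"])
```

With two processing elements the five gates fit in three cycles. With one, the same netlist takes five cycles. That gap is the kind of cost that partitioning and scheduling decisions trade off when a block is larger than the available hardware.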