As the complexity and computational demands of deep learning models rise...
This paper presents BlendNet, a neural network architecture employing a ...
Model compression has become the de-facto approach for optimizing the
ef...
Recent efforts to improve the performance of neural network (NN) acceler...
Token pruning has emerged as an effective solution to speed up the infer...
The overheads of classical decoding for quantum error correction on
supe...
Recent efforts for improving the performance of neural network (NN)
acce...
This paper introduces the sparse periodic systolic (SPS) dataflow, which...
While there is a large body of research on efficient processing of deep
...
Machine learning models differ in terms of accuracy, computational/memor...