A Greedy Algorithm for Quantizing Neural Networks

10/29/2020
by Eric Lybrand et al.

We propose a new computationally efficient method for quantizing the weights of pre-trained neural networks that is general enough to handle both multi-layer perceptrons and convolutional neural networks. Our method deterministically quantizes layers in an iterative fashion with no complicated re-training required. Specifically, we quantize each neuron, or hidden unit, using a greedy path-following algorithm. This simple algorithm is equivalent to running a dynamical system, which we prove is stable for quantizing a single-layer neural network (or, alternatively, for quantizing the first layer of a multi-layer network) when the training data are Gaussian. We show that under these assumptions, the quantization error decays with the width of the layer, i.e., its level of over-parametrization. We provide numerical experiments, on multi-layer networks, to illustrate the performance of our methods on MNIST and CIFAR10 data.
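
To make the greedy path-following step concrete, here is a minimal NumPy sketch of quantizing a single neuron, under the setting the abstract describes: weights w for one hidden unit, a data matrix X whose columns are the features that unit sees, and a fixed finite alphabet. The closed-form rounding inside the loop follows from minimizing the running residual norm at each coordinate; the function name, alphabet choice, and Gaussian test data are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def quantize_neuron(w, X, alphabet):
    """Greedy path-following quantization of one neuron (illustrative sketch).

    w        : (N,) real-valued weights of a single neuron / hidden unit
    X        : (m, N) data matrix; column t holds feature t across m samples
    alphabet : 1-D array of allowed quantized values, e.g. delta * {-1, 0, 1}
    """
    N = w.shape[0]
    q = np.zeros(N)
    u = np.zeros(X.shape[0])              # running residual along the "path"
    for t in range(N):
        x_t = X[:, t]
        norm_sq = np.dot(x_t, x_t)
        if norm_sq == 0.0:
            # degenerate feature: the residual is unaffected, so just round w[t]
            target = w[t]
        else:
            # unconstrained minimizer of ||u + (w[t] - p) x_t||_2 over p,
            # then rounded to the nearest alphabet element below
            target = w[t] + np.dot(x_t, u) / norm_sq
        q[t] = alphabet[np.argmin(np.abs(alphabet - target))]
        u += (w[t] - q[t]) * x_t          # greedy step keeps the residual small
    return q, u


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, N = 512, 256                        # Gaussian data, over-parametrized layer
    X = rng.standard_normal((m, N))
    w = rng.uniform(-1.0, 1.0, size=N)
    alphabet = np.linspace(-1.0, 1.0, 9)   # a few-bit alphabet (assumption for illustration)
    q, u = quantize_neuron(w, X, alphabet)
    # relative error of the quantized neuron's pre-activations on the data
    print(np.linalg.norm(X @ w - X @ q) / np.linalg.norm(X @ w))
```

With Gaussian data and a layer that is wide relative to the number of samples' demands, one would expect the printed relative error to be small, consistent with the abstract's claim that the quantization error decays with the layer's level of over-parametrization.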
