Accelerating Deep Learning with Memcomputing
Restricted Boltzmann machines (RBMs) and their extensions, often called "deep-belief networks", are powerful neural networks that have found widespread applicability in machine learning and big data. The standard way to train these models resorts to an iterative, unsupervised procedure based on Gibbs sampling, called "contrastive divergence", followed by additional supervised tuning via back-propagation. However, this procedure has been shown not to follow the gradient of any function and can lead to suboptimal solutions. In this paper, we show a very efficient alternative to contrastive divergence by means of simulations of digital memcomputing machines (DMMs). We test our approach on pattern recognition using the standard MNIST data set of hand-written digits. DMMs sample very effectively the vast phase space defined by the probability distribution of RBMs over the test sample inputs, and provide a very good approximation close to the optimum. This efficient search significantly reduces the number of generative pre-training iterations needed to achieve a given level of accuracy on the MNIST data set, and yields an overall performance gain over the traditional approaches. In fact, the acceleration of the pre-training achieved by simulating DMMs is comparable, in number of iterations, to the recently reported hardware application of the quantum annealing method on the same network and data set. Notably, however, DMMs perform far better than the reported quantum annealing results in terms of quality of the training. Our approach is agnostic about the connectivity of the network; it can therefore be extended to train full Boltzmann machines, and even deep networks, at once.
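For context on the baseline being replaced, the sketch below shows a single contrastive-divergence (CD-1) update for a binary RBM; the layer sizes, learning rate, and synthetic stand-in data are illustrative choices, not values from the paper. The memcomputing approach described above would, roughly speaking, replace the one-step Gibbs sampling of the negative phase with samples obtained from a simulated DMM.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary RBM trained with CD-1 (illustrative sizes and hyperparameters)."""

    def __init__(self, n_visible, n_hidden, lr=0.05):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible biases
        self.b_h = np.zeros(n_hidden)   # hidden biases
        self.lr = lr

    def cd1_step(self, v0):
        # Positive phase: hidden activations conditioned on the data.
        p_h0 = sigmoid(v0 @ self.W + self.b_h)
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        # Negative phase: one Gibbs step back to visible, then to hidden.
        p_v1 = sigmoid(h0 @ self.W.T + self.b_v)
        v1 = (rng.random(p_v1.shape) < p_v1).astype(float)
        p_h1 = sigmoid(v1 @ self.W + self.b_h)
        # CD-1 update: data statistics minus one-step model statistics.
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ p_h0 - v1.T @ p_h1) / batch
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (p_h0 - p_h1).mean(axis=0)

# Toy usage: random binary vectors stand in for binarized 28x28 MNIST digits.
rbm = RBM(n_visible=784, n_hidden=128)
data = (rng.random((64, 784)) < 0.3).astype(float)
for epoch in range(5):
    rbm.cd1_step(data)
```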