Stochastic gradient descent (SGD) is the algorithm we use ...
A fundamental open problem in deep learning theory is how to define and ...
Prevention of complete and dimensional collapse of representations has r...
Restricted Boltzmann Machines (RBMs) offer a versatile architecture for ...
This work reports deep-learning-unique first-order and second-order phas...
This work theoretically studies stochastic neural networks, a main type ...
It has been recognized that heavily overparameterized deep neural networ...
Stochastic gradient descent (SGD) has been deployed to solve highly non-...
Despite the empirical success of the deep Q network (DQN) reinforcement ...
Stochastic gradient descent (SGD) undergoes complicated multiplicative n...
The noise in stochastic gradient descent (SGD), caused by minibatch samp...
As a simple and efficient optimization method in deep learning, stochast...
Recent studies have demonstrated that noise in stochastic gradient desce...
Previous literature offers limited clues on how to learn a periodic func...
It has been recognized that a heavily overparameterized artificial neura...
We propose a novel regularization method, called volumization, for neura...
Learning in the presence of label noise is a challenging yet important t...
Identifying a divergence problem in Adam, we propose a new optimizer, La...
We generalize a standard benchmark of reinforcement learning, the classi...
We deal with the selective classification problem (supervised-learning p...