The Lookahead optimizer improves the training stability of deep neural
n...
Adaptive gradient-based optimizers, particularly Adam, have left their m...
We approach the problem of improving robustness of deep learning algorit...
Among attempts at giving a theoretical account of the success of deep ne...
We propose a unifying view to analyze the representation quality of
self...
Attention is a powerful component of modern neural networks across a wid...
We approach the problem of implicit regularization in deep learning from...
Attention is a powerful component of modern neural networks across a wid...
We revisit the bias-variance tradeoff for neural networks in light of mo...
It is well known that over-parametrized deep neural networks (DNNs) are ...
We argue that the estimation of the mutual information between high
dime...
Recent research showed that deep neural networks are highly sensitive to...