The expressive power of graph neural networks is usually measured by
com...
The simple idea that not all things are equally difficult has surprising...
As learning machines increase their influence on decisions concerning hu...
Existing works show that although modern neural networks achieve remarka...
Discriminative self-supervised learning allows training models on any ra...
Does everyone equally benefit from computer vision systems? Answers to t...
Vision Transformers (ViT) have recently emerged as a powerful alternativ...
Convolutional architectures have proven extremely successful for vision
...
One of the central features of deep learning is the generalization abili...
A recent line of research has highlighted the existence of a double desc...
The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...
Despite the phenomenal success of deep neural networks in a broad range ...
The gradient noise (GN) in the stochastic gradient descent (SGD) algorit...
We provide a description for the evolution of the generalization perform...
We argue that in fully-connected networks a phase transition delimits th...
Deep learning has been immensely successful at a variety of tasks, rangi...
We publicly release a new large-scale dataset, called SearchQA, for mach...
Machine learning techniques are being increasingly used as flexible
non-...
We look at the eigenvalues of the Hessian of a loss function before and ...
This paper proposes a new optimization algorithm called Entropy-SGD for
...
The authors present empirical distributions for the halting time (measur...
Finding minima of a real valued non-convex function over a high dimensio...