Going beyond stochastic gradient descent (SGD), what new phenomena emerg...
We show that taking the width and depth to infinity in a deep neural net...
Hyperparameter (HP) tuning in deep learning is an expensive process, pro...
Most recent progress in natural language understanding (NLU) has been dr...
We analyze the learning dynamics of infinitely wide neural networks with...
Yang (2020a) recently showed that the Neural Tangent Kernel (NTK) at ini...
As its width tends to infinity, a deep neural network's behavior under g...
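In the infinite-width limit, gradient-descent training of a network is governed by the Neural Tangent Kernel. For a one-hidden-layer ReLU network the empirical NTK (the Gram matrix of parameter gradients) can be written per hidden unit and checked against its known closed-form limit. A minimal numpy sketch — all names, widths, and test inputs below are illustrative, not taken from the papers in this list:

```python
import numpy as np

def finite_ntk(x, xp, width, seed=0):
    """Empirical NTK <grad_theta f(x), grad_theta f(x')> for
    f(x) = a . relu(W x) / sqrt(width), with W_ij, a_i ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((width, x.shape[0]))
    a = rng.standard_normal(width)
    u, v = W @ x, W @ xp
    # df/da_i = relu(u_i)/sqrt(n);  df/dW_i = a_i * 1[u_i > 0] * x / sqrt(n)
    return (np.maximum(u, 0) @ np.maximum(v, 0)
            + (a**2 * (u > 0) * (v > 0)).sum() * (x @ xp)) / width

def limit_ntk(x, xp):
    """Infinite-width limit: arc-cosine kernel plus (x.x') * P(u>0, v>0)."""
    nx, nxp = np.linalg.norm(x), np.linalg.norm(xp)
    cos_t = np.clip(x @ xp / (nx * nxp), -1.0, 1.0)
    theta = np.arccos(cos_t)
    k1 = nx * nxp / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * cos_t)
    return k1 + (x @ xp) * (np.pi - theta) / (2 * np.pi)

x, xp = np.array([1.0, 0.0]), np.array([0.6, 0.8])
k_finite = finite_ntk(x, xp, width=200_000)
k_limit = limit_ntk(x, xp)
```

At large width the empirical kernel concentrates around `limit_ntk`, which is the qualitative content of the NTK limit; at small widths it fluctuates from seed to seed.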
In a neural network (NN), weight matrices linearly transform inputs into...
We prove that a randomly initialized neural network of *any architecture...
Robustness against image perturbations bounded by an ℓ_p ball has been w...
We present a method for provably defending any pretrained image classifi...
Randomized smoothing is a recently proposed defense against adversarial ...
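The basic randomized-smoothing construction: replace a base classifier f with the smoothed classifier g(x) = argmax_c P(f(x + ε) = c) for Gaussian noise ε, estimated by a Monte Carlo majority vote, and certify an ℓ_2 radius on the order of σ·Φ⁻¹(p_A). A minimal sketch with a toy base classifier — the classifier, σ, and sample counts are illustrative assumptions, and the radius here uses a point estimate of p_A rather than a rigorous high-confidence lower bound:

```python
import numpy as np
from statistics import NormalDist

def base_classifier(x):
    # Toy stand-in for any model: class 1 iff the first coordinate is positive.
    return int(x[0] > 0.0)

def smoothed_predict(x, sigma=0.5, n_samples=2000, seed=0):
    """Majority vote of the base classifier under Gaussian noise, plus a
    heuristic certified L2 radius sigma * Phi^{-1}(p_A)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=sigma, size=(n_samples, x.shape[0]))
    votes = np.bincount([base_classifier(x + n) for n in noise], minlength=2)
    top = int(votes.argmax())
    p_a = votes[top] / n_samples        # point estimate of P(f(x+eps) = top)
    p_a = min(p_a, 1.0 - 1e-6)          # keep inv_cdf finite
    radius = sigma * NormalDist().inv_cdf(p_a) if p_a > 0.5 else 0.0
    return top, radius

cls, radius = smoothed_predict(np.array([1.0, 0.0, 0.0]))
```

A production certificate would replace the point estimate `p_a` with a confidence lower bound (e.g. Clopper-Pearson) so the radius holds with high probability.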
Wide neural networks with random weights and biases are Gaussian process...
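The statement that wide random networks are Gaussian processes can be checked numerically in the simplest case: for one hidden ReLU layer, the infinite-width covariance E_w[relu(w·x) relu(w·x')] has a closed form (the first-order arc-cosine kernel), which a Monte Carlo average over random units approaches. A minimal numpy sketch; the inputs and sample count are illustrative assumptions:

```python
import numpy as np

def arccos_kernel(x, xp):
    """Analytic infinite-width covariance E_w[relu(w.x) relu(w.x')]
    for w ~ N(0, I): the first-order arc-cosine kernel."""
    nx, nxp = np.linalg.norm(x), np.linalg.norm(xp)
    cos_t = np.clip(x @ xp / (nx * nxp), -1.0, 1.0)
    theta = np.arccos(cos_t)
    return nx * nxp / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * cos_t)

def empirical_kernel(x, xp, n_units=200_000, seed=0):
    # Monte Carlo estimate of the same covariance over random hidden units.
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_units, x.shape[0]))
    return np.mean(np.maximum(W @ x, 0) * np.maximum(W @ xp, 0))

x, xp = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
k_mc = empirical_kernel(x, xp)
k_exact = arccos_kernel(x, xp)
```

For these orthogonal unit inputs the exact value is 1/(2π), and the Monte Carlo estimate converges to it at the usual 1/√n rate as the number of units grows.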
Function classes are collections of Boolean functions on a finite set, w...
Are neural networks biased toward simple functions? Does depth always he...
Recent works have shown the effectiveness of randomized smoothing as a s...
Verification of neural networks enables us to gauge their robustness aga...
We develop a mean field theory for batch normalization in fully-connecte...
Several recent trends in machine learning theory and practice, from the ...
Interactive Fiction (IF) games are complex textual decision making probl...
Training recurrent neural networks (RNNs) on long sequence tasks is plag...
We study randomly initialized residual networks using mean field theory ...
External neural memory structures have recently become a popular tool fo...
Following the recent trend in explicit neural memory structures, we pres...