Fine-tuning language models (LMs) has yielded success on diverse downstr...
We focus on the task of learning a single index model σ(w^⋆·x) with res...
One of the central questions in the theory of deep learning is to unders...
Traditional analyses of gradient descent show that when the largest eige...
Significant theoretical work has established that in specific regimes, n...
In overparametrized models, the noise in stochastic gradient descent (SG...
This work identifies and addresses two important technical challenges in...