A scaling law refers to the observation that the test performance of a m...
Bilevel optimization has various applications such as hyper-parameter op...
Recent progress was made in characterizing the generalization error of g...
Theoretical properties of bilevel problems are well studied when the low...
Reinforcement learning (RL) has exceeded human performance in many synth...
Studies on benign overfitting provide insights for the success of overpa...
Most existing analyses of (stochastic) gradient descent rely on the cond...
Determining whether saddle points exist or are approximable for nonconve...
It is a well-known fact that nonconvex optimization is computationally i...
Federated Learning (FL) coordinates with numerous heterogeneous devices ...
We provide a first-order oracle complexity lower bound for finding stati...
We study multi-objective reinforcement learning (RL) where an agent's re...
The label shift problem refers to the supervised learning setting where ...
We investigate stochastic optimization problems under relaxed assumption...
We provide the first non-asymptotic analysis for finding stationary poin...
While stochastic gradient descent (SGD) is still the de facto algorithm ...
We provide a theoretical explanation for the fast convergence of gradien...
The exposure bias problem refers to the training-inference discrepancy c...
Both generative adversarial network models and variational autoencoders ...
This project report compares some known GAN and VAE models proposed prio...
We study smooth stochastic optimization problems on Riemannian manifolds...
We study gradient-based optimization methods obtained by directly discre...