In federated frequency estimation (FFE), multiple clients work together ...
Recent research has observed that in machine learning optimization, grad...
We consider a continual learning (CL) problem with two linear regression...
This paper considers the problem of learning a single ReLU neuron with s...
We study linear regression under covariate shift, where the marginal dis...
Stochastic gradient descent (SGD) has achieved great success due to its ...
Stochastic gradient descent (SGD) has been demonstrated to generalize we...
For the problem of task-agnostic reinforcement learning (RL), an agent f...
Stochastic gradient descent (SGD) exhibits strong algorithmic regulariza...
Preventing catastrophic forgetting while continually learning new tasks ...
There is an increasing realization that algorithmic inductive biases are...
In this paper we consider multi-objective reinforcement learning where t...
Understanding the algorithmic regularization effect of stochastic gradie...
Regularization for optimization is a crucial technique to avoid overfitt...
The randomness in Stochastic Gradient Descent (SGD) is considered to pla...
The ever-increasing size of modern datasets combined with the difficulty...
Understanding the generalization of deep learning has raised lots of con...