The recently proposed stochastic Polyak stepsize (SPS) and stochastic li...
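For reference, the stochastic Polyak stepsize named in this abstract scales each step by the current mini-batch loss; a minimal sketch assuming the common interpolation simplification f_i^* = 0 (this simplification and the constant c are assumptions, not the paper's exact variant):

```python
import numpy as np

def sps_step(x, grad_i, loss_i, c=0.5, eps=1e-12):
    # SPS: gamma_t = (f_i(x_t) - f_i^*) / (c * ||grad f_i(x_t)||^2), with f_i^* = 0 assumed
    gamma = loss_i / (c * np.dot(grad_i, grad_i) + eps)
    return x - gamma * grad_i
```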
State-of-the-art federated learning algorithms such as FedAvg require ca...
In federated learning, data heterogeneity is a critical challenge. A str...
Distributed and federated learning algorithms and techniques associated...
Stochastic Gradient Descent (SGD) algorithms are widely used in optimizi...
Gradient clipping is a popular modification to standard (stochastic) gra...
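As a reminder of the update this abstract refers to, gradient clipping rescales the stochastic gradient whenever its norm exceeds a threshold before the descent step; a minimal sketch (the threshold and stepsize are illustrative assumptions):

```python
import numpy as np

def clipped_sgd_step(x, grad, lr=0.1, clip_norm=1.0):
    # rescale the gradient so its norm is at most clip_norm, then take an SGD step
    scale = min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    return x - lr * scale * grad
```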
Gradient tracking (GT) is an algorithm designed for solving decentralize...
Data heterogeneity across clients is a key challenge in federated learni...
We study the asynchronous stochastic gradient descent algorithm for dist...
Decentralized learning provides an effective framework to train machine...
Non-convex optimization problems are ubiquitous in machine learning, esp...
We consider decentralized machine learning over a network where the trai...
Uncertainty estimation (UE) techniques – such as the Gaussian process (G...
Personalization in federated learning can improve the accuracy of a mode...
Federated learning is a powerful distributed learning scheme that allows...
In decentralized machine learning, workers compute model updates on thei...
We consider federated learning (FL), where the training data is distribu...
Data augmentation is a widely adopted technique for avoiding overfitting...
For deploying deep learning models to lower end devices, it is necessary...
We consider decentralized stochastic variational inequalities where the...
It has been experimentally observed that the efficiency of distributed t...
Decentralized training of deep learning models enables on-device learnin...
Decentralized training of deep learning models is a key element for enab...
Decentralized optimization methods enable on-device training of machine...
Lossy gradient compression, with either unbiased or biased compressors, ...
Federated learning is a challenging optimization problem due to the hete...
We analyze the complexity of biased stochastic gradient methods (SGD), w...
Deep neural networks often have millions of parameters. This can hinder ...
Federated Learning (FL) is a machine learning setting where many devices...
Deep learning networks are typically trained by Stochastic Gradient Desc...
Decentralized stochastic optimization methods have gained a lot of atten...
We study local SGD (also known as parallel SGD and federated averaging),...
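Local SGD (parallel SGD / federated averaging), as named in this abstract, alternates several local SGD steps per worker with averaging of the worker models; a minimal sketch on a toy objective (the quadratic losses, stepsizes, and step counts are assumptions):

```python
import numpy as np

def local_sgd(x0, worker_grads, local_steps=8, rounds=10, lr=0.05):
    # each round: every worker runs `local_steps` SGD steps from the shared model, then models are averaged
    x = np.array(x0, dtype=float)
    for _ in range(rounds):
        models = []
        for grad in worker_grads:
            xw = x.copy()
            for _ in range(local_steps):
                xw -= lr * grad(xw)
            models.append(xw)
        x = np.mean(models, axis=0)  # one communication round
    return x

# toy example: worker w holds the loss ||x - w||^2
grads = [(lambda w: (lambda x: 2 * (x - w)))(w) for w in range(4)]
print(local_sgd(np.zeros(1), grads))  # approaches the average minimizer 1.5
```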
Federated learning (FL) is a machine learning setting where many clients...
Federated learning is a key scenario in modern large-scale machine learn...
We analyze (stochastic) gradient descent (SGD) with delayed updates on s...
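The delayed-update SGD analyzed here applies, at step t, a gradient computed at a stale iterate; a minimal sketch with a fixed delay tau (the fixed delay, toy objective, and stepsize are assumptions):

```python
import numpy as np

def delayed_sgd(grad, x0, tau=3, steps=200, lr=0.05):
    # the gradient applied at step t was evaluated at the older iterate x_{t - tau}
    xs = [np.array(x0, dtype=float)]
    for t in range(steps):
        x_stale = xs[max(0, t - tau)]
        xs.append(xs[-1] - lr * grad(x_stale))
    return xs[-1]

print(delayed_sgd(lambda x: 2 * x, np.ones(1)))  # still converges to 0 for small enough lr
```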
Decentralized training of deep learning models is a key element for enab...
In this note we give a simple proof for the convergence of stochastic gr...
We consider decentralized stochastic optimization with the objective fun...
Sign-based algorithms (e.g. signSGD) have been proposed as a biased grad...
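signSGD, cited here as a biased gradient compression scheme, transmits only the elementwise sign of the stochastic gradient (one bit per coordinate); a minimal sketch of the update:

```python
import numpy as np

def signsgd_step(x, grad, lr=0.01):
    # keep only the sign of each gradient coordinate (a biased, 1-bit compressor)
    return x - lr * np.sign(grad)
```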
Coordinate descent with random coordinate selection is the current state...
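Coordinate descent with random coordinate selection, as referenced in this abstract, updates one uniformly sampled coordinate per iteration using its partial derivative; a minimal sketch on a quadratic (the objective and stepsize are assumptions):

```python
import numpy as np

def random_coordinate_descent(A, b, x0, steps=1000, lr=0.1):
    # minimize 0.5 * x^T A x - b^T x, touching one random coordinate per iteration
    rng = np.random.default_rng(0)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        i = rng.integers(len(x))        # random coordinate selection
        x[i] -= lr * (A[i] @ x - b[i])  # partial derivative w.r.t. x_i
    return x

A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 1.0])
print(random_coordinate_descent(A, b, np.zeros(2)))  # approaches the solution of A x = b
```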
Huge scale machine learning problems are nowadays tackled by distributed...
Mini-batch stochastic gradient methods are the current state of the art ...
We show that Newton's method converges globally at a linear rate for obj...
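Newton's method, whose global convergence is the subject of this abstract, preconditions each step by the inverse Hessian; a minimal sketch (the example objective is an assumption, not taken from the paper):

```python
import numpy as np

def newton_step(x, grad, hess):
    # undamped Newton step: x - H(x)^{-1} grad f(x)
    return x - np.linalg.solve(hess, grad)

# example: f(x) = sum(exp(x) - x), with grad = exp(x) - 1 and Hessian = diag(exp(x))
x = np.array([2.0, -1.0])
for _ in range(10):
    x = newton_step(x, np.exp(x) - 1.0, np.diag(np.exp(x)))
print(x)  # converges to the minimizer x = 0
```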
Mini-batch stochastic gradient descent (SGD) is the state of the art in ...
In recent years, many variance reduced algorithms for empirical risk min...
Two popular examples of first-order optimization methods over linear spa...