This paper applies an idea of adaptive momentum for the nonlinear conjug...
The explicit low-rank regularization, e.g., nuclear norm regularization,...
Transformers have achieved remarkable success in sequence modeling and b...
Learning neural ODEs often requires solving very stiff ODE systems, prim...
Proper orthogonal decomposition (POD) allows reduced-order modeling of c...
We propose GLassoformer, a novel and efficient transformer architecture ...
We propose near-optimal overlay networks based on d-regular expander gra...
Heavy ball momentum is crucial in accelerating (stochastic) gradient-bas...
We present and review an algorithmic and theoretical framework for impro...
We propose heavy ball neural ordinary differential equations (HBNODEs), ...
We propose FMMformers, a class of efficient and flexible transformers in...
Federated averaging (FedAvg) is a communication efficient algorithm for ...
Graph Laplacian (GL)-based semi-supervised learning is one of the most u...
The stability and generalization of stochastic gradient-based methods pr...
Momentum plays a crucial role in stochastic gradient-based optimization ...
Low dose computed tomography (LDCT) is desirable for both diagnostic ima...
Deep neural networks (DNNs) need to be both efficient and robust for pr...
Designing deep neural networks is an art that often involves an expensiv...
Federated learning aims to protect data privacy by collaboratively learn...
Deep neural net (DNN) compression is crucial for adaptation to mobile ...
Stochastic gradient descent (SGD) with constant momentum and its variant...
As an important Markov Chain Monte Carlo (MCMC) method, stochastic gradi...
Improving the accuracy and robustness of deep neural nets (DNNs) and ada...
Machine learning (ML) models trained by differentially private stochasti...
We study epidemic forecasting on real-world health data by a graph-struc...
Loss functions with a large number of saddle points are one of the main ...
We propose a simple yet powerful ResNet ensemble algorithm which consist...
In this paper, we analyze the efficacy of the fast gradient sign method (FGS...
We improve the robustness of deep neural nets to adversarial attacks by ...
We propose a very simple modification of gradient descent and stochastic...
Deep neural networks (DNNs) typically have enough capacity to fit random...
We present a generic framework for spatio-temporal (ST) data modeling, a...
Though deep neural networks (DNNs) achieve remarkable performances in ma...
Real-time crime forecasting is important. However, accurate prediction o...
Accurate real-time crime prediction is a fundamental issue for public sa...