Diffusion Probabilistic Models (DPMs) have achieved considerable success...
Coordinate denoising is a promising 3D molecular pre-training method, wh...
We study a new kind of SDE that arose from research on optimiza...
Geometric deep learning enables the encoding of physical symmetries in m...
Recently, generalization on out-of-distribution (OOD) data with correlat...
Stochastic partial differential equations (SPDEs) are significant tools ...
The momentum acceleration technique is widely adopted in many optimizati...
Learning dynamics governed by differential equations is crucial for pred...
It is often argued that flatter minima generalize better. Howev...
Batch normalization (BN) has become a crucial component across diverse d...
In this paper, we provide a unified analysis of the excess risk of the m...
Stochastic gradient descent (SGD) and its variants are mainstream method...
Based on the basis path set, the G-SGD algorithm significantly outperforms conve...
It is well known that historical logs are used for evaluating and le...
It was empirically confirmed by Keskar et al. that flatter mi...
In reinforcement learning (RL), one of the key components is policy eva...
Q-learning is one of the most popular methods in Reinforcement Learning ...
Asynchronous stochastic gradient descent (ASGD) is a popular parallel op...
When using stochastic gradient descent to solve large-scale machine lear...
Many machine learning tasks can be formulated as Regularized Empirical R...