The rapid advances of large language models (LLMs), such as ChatGPT, are...
Hyperparameter optimization, also known as hyperparameter tuning, is a w...
The extraordinary capabilities of large language models (LLMs) such as C...
In this paper, we study the implicit regularization of stochastic gradie...
In 2023, the International Conference on Machine Learning (ICML) require...
Multilayer neural networks have achieved superhuman performance in many ...
Classical algorithms are often not effective for solving nonconvex optim...
Alice (owner) has knowledge of the underlying quality of her items measu...
Many modern machine learning algorithms are composed of simple private a...
Algorithmic fairness plays an important role in machine learning and imp...
Commonsense causality reasoning (CCR) aims at identifying plausible caus...
To advance deep learning methodologies in the next decade, a theoretical...
I consider the setting where reviewers offer very noisy scores for a num...
Understanding the training dynamics of deep learning models is perhaps a...
Neural collapse is a highly symmetric geometric pattern of neural networ...
In this paper, we introduce Target-Aware Weighted Training (TAWT), a wei...
Sorted l1 regularization has been incorporated into many methods for sol...
Being able to efficiently and accurately select the top-k elements witho...
In this rejoinder, we aim to address two broad issues that cover most co...
Perhaps the single most important use case for differential privacy is t...
In this paper, we introduce the Layer-Peeled Model, a nonconvex yet anal...
Classical approaches in learning theory are often seen to yield very loo...
As a popular approach to modeling the dynamics of training overparametri...
Hard parameter sharing for multi-task learning is widely used in empiric...
An acknowledged weakness of neural networks is their vulnerability to ad...
In a linear model with possibly many predictors, we consider variable se...
A fundamental problem in high-dimensional regression is to understan...
In high-dimensional linear regression, would increasing effect sizes alw...
The learning rate is perhaps the single most important parameter in the ...
Datasets containing sensitive information are often sequentially analyze...
Deep learning models are often trained on datasets that contain sensitiv...
This paper presents a phenomenon in neural networks that we refer to as ...
Differential privacy has seen remarkable success as a rigorous and pract...
We study first-order optimization methods obtained by discretizing ordin...
This paper introduces the FDR-linking theorem, a novel technique for und...
Gradient-based optimization algorithms can be studied from the perspecti...
Differential privacy provides a rigorous framework for privacy-preservin...
Drawing statistical inferences from large datasets in a model-robust way...