We examine online safe multi-agent reinforcement learning using constrai...
While single-agent policy optimization in a fixed environment has attrac...
Embedding learning has found widespread applications in recommendation s...
Tremendous success of machine learning (ML) and the unabated growth in M...
Data Parallelism (DP) and Model Parallelism (MP) are two common paradigm...
Temporal-Difference (TD) learning with nonlinear smooth function approxi...
We study constrained online convex optimization, where the constraints c...
The purpose of this thesis is to develop new theories on high-dimensiona...
We consider online learning for episodic Markov decision processes (MDPs...
We study the Safe Reinforcement Learning (SRL) problem using the Constra...
We study the robust one-bit compressed sensing problem whose goal is to ...
We consider a distributed multi-agent policy evaluation problem in reinf...
We propose a new primal-dual homotopy smoothing algorithm for a linearly...
This paper introduces a general regularized thresholded least-square pro...
Let Y be a d-dimensional random vector with unknown mean μ and covarianc...
This paper considers online convex optimization (OCO) with stochastic co...