We consider robust empirical risk minimization (ERM), where model parame...
In recommender-system or crowdsourcing applications of online learning, ...
Adaptive optimization methods are well known to achieve superior converg...
Policy regret is a well-established notion of measuring the performance ...
Reinforcement learning (RL) is empirically successful in complex nonline...
Agents trained by reinforcement learning (RL) often fail to generalize b...
We study derivative-free methods for policy optimization over the class ...
Our goal is for AI systems to correctly identify and act according to th...
For an autonomous system to provide value (e.g., to customers, designers...