Timothy A. Mann

research

∙ 07/24/2018

Learning from Delayed Outcomes with Intermediate Observations

Optimizing for long term value is desirable in many practical applicatio...

4 Timothy A. Mann, et al. ∙

research

∙ 03/11/2018

Soft-Robust Actor-Critic Policy-Gradient

Robust Reinforcement Learning aims to derive an optimal behavior that ac...

0 Esther Derman, et al. ∙

research

∙ 03/05/2018

Optimizing Slate Recommendations via Slate-CVAE

The slate recommendation problem aims to find the "optimal" ordering of ...

0 Ray Jiang, et al. ∙

research

∙ 02/09/2018

Learning Robust Options

Robust reinforcement learning aims to produce policies that have strong ...

0 Daniel J. Mankowitz, et al. ∙

research

∙ 12/30/2016

Adaptive Lambda Least-Squares Temporal Difference Learning

Temporal Difference learning or TD(λ) is a fundamental algorithm in the ...

0 Timothy A. Mann, et al. ∙

research

∙ 02/10/2016

Adaptive Skills, Adaptive Partitions (ASAP)

We introduce the Adaptive Skills, Adaptive Partitions (ASAP) framework t...

0 Daniel J. Mankowitz, et al. ∙

research

∙ 02/10/2016

Iterative Hierarchical Optimization for Misspecified Problems (IHOMP)

For complex, high-dimensional Markov Decision Processes (MDPs), it may b...

0 Daniel J. Mankowitz, et al. ∙

research

∙ 06/11/2015

Bootstrapping Skills

The monolithic approach to policy representation in Markov Decision Proc...

0 Daniel J. Mankowitz, et al. ∙

research

∙ 04/16/2015

Actively Learning to Attract Followers on Twitter

Twitter, a popular social network, presents great opportunities for on-l...

0 Nir Levine, et al. ∙

