Hengshuai Yao

research

∙ 08/22/2023

Careful at Estimation and Bold at Exploration

Exploration strategies in continuous action space are often heuristic du...

0 Xing Chen, et al. ∙

research

∙ 08/18/2023

Baird Counterexample Is Solved: with an example of How to Debug a Two-time-scale Algorithm

Baird counterexample was proposed by Leemon Baird in 1995, first used to...

0 Hengshuai Yao, et al. ∙

research

∙ 07/29/2023

A new Gradient TD Algorithm with only One Step-size: Convergence Rate Analysis using L-λ Smoothness

Gradient Temporal Difference (GTD) algorithms (Sutton et al., 2008, 2009...

0 Hengshuai Yao, et al. ∙

research

∙ 11/25/2022

The Vanishing Decision Boundary Complexity and the Strong First Component

We show that unlike machine learning classifiers, there are no complex b...

0 Hengshuai Yao, et al. ∙

research

∙ 10/31/2022

Class Interference of Deep Neural Networks

Recognizing and telling similar objects apart is even hard for human bei...

0 Dongcui Diao, et al. ∙

research

∙ 05/20/2022

Sigmoidally Preconditioned Off-policy Learning: a new exploration method for reinforcement learning

One of the major difficulties of reinforcement learning is learning from...

0 Xing Chen, et al. ∙

research

∙ 04/01/2022

Learning to Accelerate by the Methods of Step-size Planning

Gradient descent is slow to converge for ill-conditioned problems and no...

0 Hengshuai Yao, et al. ∙

research

∙ 12/21/2021

Explainable Artificial Intelligence for Autonomous Driving: A Comprehensive Overview and Field Guide for Future Research Directions

Autonomous driving has achieved a significant milestone in research and ...

0 Shahin Atakishiyev, et al. ∙

research

∙ 11/20/2021

Towards safe, explainable, and regulated autonomous driving

There has been growing interest in the development and deployment of aut...

0 Shahin Atakishiyev, et al. ∙

research

∙ 09/17/2021

Exploring the Robustness of Distributional Reinforcement Learning against Noisy State Observations

In real scenarios, state observations that an agent observes may contain...

7 Ke Sun, et al. ∙

research

∙ 01/21/2021

Breaking the Deadly Triad with a Target Network

The deadly triad refers to the instability of a reinforcement learning a...

0 Shangtong Zhang, et al. ∙

research

∙ 09/14/2020

Variance-Reduced Off-Policy Memory-Efficient Policy Search

Off-policy policy optimization is a challenging problem in reinforcement...

0 Daoming Lyu, et al. ∙

research

∙ 07/19/2020

Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities

Model-based reinforcement learning (MBRL) can significantly improve samp...

12 Jincheng Mei, et al. ∙

research

∙ 07/07/2020

Towards a practical measure of interference for reinforcement learning

Catastrophic interference is common in many network-based learning syste...

28 Vincent Liu, et al. ∙

research

∙ 01/26/2020

Weakly Supervised Few-shot Object Segmentation using Co-Attention with Visual and Semantic Inputs

Significant progress has been made recently in developing few-shot objec...

9 Mennatullah Siam, et al. ∙

research

∙ 12/18/2019

One-Shot Weakly Supervised Video Object Segmentation

Conventional few-shot object segmentation methods learn object segmentat...

0 Mennatullah Siam, et al. ∙

research

∙ 11/11/2019

Provably Convergent Off-Policy Actor-Critic with Function Approximation

We present the first provably convergent off-policy actor-critic algorit...

26 Shangtong Zhang, et al. ∙

research

∙ 11/08/2019

Mapless Navigation among Dynamics with Social-safety-awareness: a reinforcement learning approach from 2D laser scans

We propose a method to tackle the problem of mapless collision-avoidance...

0 Jun Jin, et al. ∙

research

∙ 10/03/2019

Is Fast Adaptation All You Need?

Gradient-based meta-learning has proven to be highly effective at learni...

24 Khurram Javed, et al. ∙

research

∙ 06/18/2019

Hill Climbing on Value Estimates for Search-control in Dyna

Dyna is an architecture for model-based reinforcement learning (RL), whe...

0 Yangchen Pan, et al. ∙

research

∙ 05/13/2019

Distributional Reinforcement Learning for Efficient Exploration

In distributional reinforcement learning (RL), the estimated distributio...

0 Borislav Mavrin, et al. ∙

research

∙ 03/20/2019

Reinforcing Classical Planning for Adversary Driving Scenarios

Adversary scenarios in driving, where the other vehicles may make mistak...

0 Nazmus Sakib, et al. ∙

research

∙ 03/18/2019

Deep Reinforcement Learning with Decorrelation

Learning an effective representation for high-dimensional data is a chal...

0 Borislav Mavrin, et al. ∙

research

∙ 11/06/2018

ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search

In this paper, we propose an actor ensemble algorithm, named ACE, for co...

0 Shangtong Zhang, et al. ∙

research

∙ 11/05/2018

QUOTA: The Quantile Option Architecture for Reinforcement Learning

In this paper, we propose the Quantile Option Architecture (QUOTA) for e...

0 Shangtong Zhang, et al. ∙

research

∙ 04/27/2018

Negative Log Likelihood Ratio Loss for Deep Neural Network Classification

In deep neural network, the cross-entropy loss function is commonly used...

0 Donglai Zhu, et al. ∙

research

∙ 02/08/2018

Practical Issues of Action-conditioned Next Image Prediction

The problem of action-conditioned image prediction is to predict the exp...

0 Donglai Zhu, et al. ∙
