Yihao Feng

research

∙ 08/11/2023

BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents

The massive successes of large language models (LLMs) encourage the emer...

Zhiwei Liu, et al.

research

∙ 08/04/2023

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

Recent months have seen the emergence of a powerful new trend in which l...

Weiran Yao, et al.

research

∙ 07/18/2023

REX: Rapid Exploration and eXploitation for AI Agents

In this paper, we propose an enhanced approach for Rapid Exploration and...

Rithesh Murthy, et al.

research

∙ 06/06/2023

FAMO: Fast Adaptive Multitask Optimization

One of the grand enduring goals of AI is to create generalist agents tha...

Bo Liu, et al.

research

∙ 06/05/2023

LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning

Lifelong learning offers a promising paradigm of building a generalist a...

Bo Liu, et al.

research

∙ 06/01/2023

Preference-grounded Token-level Guidance for Language Model Fine-tuning

Aligning language models (LMs) with preferences is an important problem ...

Shentao Yang, et al.

research

∙ 05/18/2023

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

Achieving machine autonomy and human control often represent divergent o...

Can Qin, et al.

research

∙ 03/16/2023

HIVE: Harnessing Human Feedback for Instructional Visual Editing

Incorporating human feedback has been shown to be crucial to align text ...

Shu Zhang, et al.

research

∙ 02/20/2023

Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems

When learning task-oriented dialogue (ToD) agents, reinforcement learnin...

Yihao Feng, et al.

research

∙ 10/12/2022

A Unified Framework for Alternating Offline Model Training and Policy Learning

In offline model-based reinforcement learning (offline MBRL), we learn a...

Shentao Yang, et al.

research

∙ 08/17/2022

Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning

Goal-conditioned reinforcement learning (GCRL) has a wide range of poten...

Bo Liu, et al.

research

∙ 06/14/2022

Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning

Offline reinforcement learning (RL) extends the paradigm of classical RL...

Shentao Yang, et al.

research

∙ 02/19/2022

A Regularized Implicit Policy for Offline Reinforcement Learning

Offline reinforcement learning enables learning from a fixed dataset, wi...

Shentao Yang, et al.

research

∙ 01/01/2022

Operator Deep Q-Learning: Zero-Shot Reward Transferring in Reinforcement Learning

Reinforcement learning (RL) has drawn increasing interests in recent yea...

Ziyang Tang, et al.

research

∙ 06/02/2021

Unsupervised Out-of-Domain Detection via Pre-trained Transformers

Deployed real-world machine learning applications are often subject to u...

Keyang Xu, et al.

research

∙ 04/24/2021

Incremental Few-shot Text Classification with Multi-round New Classes: Formulation, Dataset and System

Text classification is usually studied by labeling natural language text...

Congying Xia, et al.

research

∙ 03/09/2021

Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds

Off-policy evaluation (OPE) is the task of estimating the expected rewar...

Yihao Feng, et al.

research

∙ 10/29/2020

Off-Policy Interval Estimation with Lipschitz Value Iteration

Off-policy evaluation provides an essential tool for evaluating the effe...

Ziyang Tang, et al.

research

∙ 08/15/2020

Accountable Off-Policy Evaluation With Kernel Bellman Statistics

We consider off-policy evaluation (OPE), which evaluates the performance...

Yihao Feng, et al.

research

∙ 10/16/2019

Doubly Robust Bias Reduction in Infinite Horizon Off-Policy Estimation

Infinite horizon off-policy policy evaluation is a highly challenging ta...

Ziyang Tang, et al.

research

∙ 05/25/2019

A Kernel Loss for Solving the Bellman Equation

Value function learning plays a central role in many state-of-the-art re...

Yihao Feng, et al.

research

∙ 09/29/2018

Knowledge-guided Semantic Computing Network

It is very useful to integrate human knowledge and experience into tradi...

Guangming Shi, et al.

research

∙ 10/30/2017

Sample-efficient Policy Optimization with Stein Control Variate

Policy gradient methods have achieved remarkable successes in solving ch...

Hao Liu, et al.

research

∙ 07/20/2017

Learning to Draw Samples with Amortized Stein Variational Gradient Descent

We propose a simple algorithm to train stochastic neural networks to dra...

Yihao Feng, et al.
