Owain Evans

research

∙ 09/21/2023

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

We expose a surprising failure of generalization in auto-regressive larg...

0 Lukas Berglund, et al. ∙

research

∙ 09/01/2023

Taken out of context: On measuring situational awareness in LLMs

We aim to better understand the emergence of 'situational awareness' in ...

0 Lukas Berglund, et al. ∙

research

∙ 06/30/2022

Forecasting Future World Events with Neural Networks

Forecasting future world events is a challenging but valuable task. Fore...

0 Andy Zou, et al. ∙

research

∙ 05/28/2022

Teaching Models to Express Their Uncertainty in Words

We show that a GPT-3 model can learn to express uncertainty about its ow...

0 Stephanie Lin, et al. ∙

research

∙ 10/13/2021

Truthful AI: Developing and governing AI that does not lie

In many contexts, lying – the use of verbal falsehoods to deceive – is h...

0 Owain Evans, et al. ∙

research

∙ 09/08/2021

TruthfulQA: Measuring How Models Mimic Human Falsehoods

We propose a benchmark to measure whether a language model is truthful i...

0 Stephanie Lin, et al. ∙

research

∙ 11/13/2020

Active Reinforcement Learning: Observing Rewards at a Cost

Active reinforcement learning (ARL) is a variant on reinforcement learni...

0 David Krueger, et al. ∙

research

∙ 11/16/2019

Sensory Optimization: Neural Networks as a Model for Understanding and Creating Art

This article is about the cognitive science of visual art. Artists creat...

39 Owain Evans, et al. ∙

research

∙ 07/02/2019

Generalizing from a few environments in safety-critical reinforcement learning

Before deploying autonomous agents in the real world, we need to be conf...

7 Zachary Kenton, et al. ∙

research

∙ 03/13/2018

Active Reinforcement Learning with Monte-Carlo Tree Search

Active Reinforcement Learning (ARL) is a twist on RL where the agent obs...

0 Sebastian Schulze, et al. ∙

research

∙ 02/20/2018

The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation

This report surveys the landscape of potential security threats from mal...

3 Miles Brundage, et al. ∙

research

∙ 07/17/2017

Trial without Error: Towards Safe Reinforcement Learning via Human Intervention

AI systems are increasingly applied to complex tasks that involve intera...

0 William Saunders, et al. ∙

research

∙ 05/24/2017

When Will AI Exceed Human Performance? Evidence from AI Experts

Advances in artificial intelligence (AI) will transform modern life by r...

0 Katja Grace, et al. ∙

research

∙ 01/15/2017

Agent-Agnostic Human-in-the-Loop Reinforcement Learning

Providing Reinforcement Learning agents with expert advice can dramatica...

0 David Abel, et al. ∙

research

∙ 12/18/2015

Learning the Preferences of Ignorant, Inconsistent Agents

An important use of machine learning is to learn what people value. What...

0 Owain Evans, et al. ∙

