We expose a surprising failure of generalization in auto-regressive larg...
We aim to better understand the emergence of `situational awareness' in ...
Forecasting future world events is a challenging but valuable task. Fore...
We show that a GPT-3 model can learn to express uncertainty about its ow...
In many contexts, lying – the use of verbal falsehoods to deceive – is
h...
We propose a benchmark to measure whether a language model is truthful i...
Active reinforcement learning (ARL) is a variant on reinforcement learni...
This article is about the cognitive science of visual art. Artists creat...
Before deploying autonomous agents in the real world, we need to be conf...
Active Reinforcement Learning (ARL) is a twist on RL where the agent obs...
This report surveys the landscape of potential security threats from
mal...
AI systems are increasingly applied to complex tasks that involve intera...
Advances in artificial intelligence (AI) will transform modern life by
r...
Providing Reinforcement Learning agents with expert advice can dramatica...
An important use of machine learning is to learn what people value. What...