Amanda Askell

research

∙ 06/28/2023

Towards Measuring the Representation of Subjective Global Opinions in Language Models

Large language models (LLMs) may not equitably represent diverse global ...

0 Esin Durmus, et al. ∙

research

∙ 02/15/2023

The Capacity for Moral Self-Correction in Large Language Models

We test the hypothesis that language models trained with reinforcement l...

0 Deep Ganguli, et al. ∙

research

∙ 12/15/2022

Constitutional AI: Harmlessness from AI Feedback

As AI systems become more capable, we would like to enlist their help to...

0 Yuntao Bai, et al. ∙

research

∙ 11/04/2022

Measuring Progress on Scalable Oversight for Large Language Models

Developing safe and useful general-purpose AI systems will require us to...

0 Samuel R. Bowman, et al. ∙

research

∙ 09/24/2022

In-context Learning and Induction Heads

"Induction heads" are attention heads that implement a simple algorithm ...

8 Catherine Olsson, et al. ∙

research

∙ 08/23/2022

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

We describe our early efforts to red team language models in order to si...

0 Deep Ganguli, et al. ∙

research

∙ 07/11/2022

Language Models (Mostly) Know What They Know

We study whether language models can evaluate the validity of their own ...

12 Saurav Kadavath, et al. ∙

research

∙ 04/12/2022

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

We apply preference modeling and reinforcement learning from human feedb...

2 Yuntao Bai, et al. ∙

research

∙ 03/04/2022

Training language models to follow instructions with human feedback

Making language models bigger does not inherently make them better at fo...

1 Long Ouyang, et al. ∙

research

∙ 02/15/2022

Predictability and Surprise in Large Generative Models

Large-scale pre-training has recently emerged as a technique for creatin...

0 Deep Ganguli, et al. ∙

research

∙ 12/01/2021

A General Language Assistant as a Laboratory for Alignment

Given the broad capabilities of large language models, it should be poss...

11 Amanda Askell, et al. ∙

research

∙ 02/26/2021

Learning Transferable Visual Models From Natural Language Supervision

State-of-the-art computer vision systems are trained to predict a fixed ...

8 Alec Radford, et al. ∙

research

∙ 05/28/2020

Language Models are Few-Shot Learners

Recent work has demonstrated substantial gains on many NLP tasks and ben...

34 Tom B. Brown, et al. ∙

research

∙ 04/15/2020

Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims

With the recent wave of progress in artificial intelligence (AI) has com...

0 Miles Brundage, et al. ∙

research

∙ 08/24/2019

Release Strategies and the Social Impacts of Language Models

Large language models have a range of beneficial uses: they can assist i...

0 Irene Solaiman, et al. ∙

research

∙ 07/10/2019

The Role of Cooperation in Responsible AI Development

In this paper, we argue that competitive pressures could incentivize AI ...

0 Amanda Askell, et al. ∙

