Large language models (LLMs) may not equitably represent diverse global ...
We test the hypothesis that language models trained with reinforcement l...
As AI systems become more capable, we would like to enlist their help to...
Developing safe and useful general-purpose AI systems will require us to...
"Induction heads" are attention heads that implement a simple algorithm ...
We describe our early efforts to red team language models in order to si...
We study whether language models can evaluate the validity of their own ...
We apply preference modeling and reinforcement learning from human feedb...
Making language models bigger does not inherently make them better at fo...
Large-scale pre-training has recently emerged as a technique for creatin...
Given the broad capabilities of large language models, it should be poss...
State-of-the-art computer vision systems are trained to predict a fixed ...
Recent work has demonstrated substantial gains on many NLP tasks and ben...
With the recent wave of progress in artificial intelligence (AI) has com...
Large language models have a range of beneficial uses: they can assist i...
In this paper, we argue that competitive pressures could incentivize AI ...