We fine-tune GPT-3 to answer long-form questions using a text-based web-...
State-of-the-art language models can match human performance on many tas...
The NeurIPS 2020 Procgen Competition was designed as a centralized bench...
We identify empirical scaling laws for the cross-entropy loss in four do...
Recent work has demonstrated substantial gains on many NLP tasks and ben...
In this report, we introduce Procgen Benchmark, a suite of 16 procedural...
In this report, we present a new reinforcement learning (RL) benchmark b...