As language models grow ever larger, the need for large-scale high-quali...
We identify the task of measuring data to quantitatively characterize th...
In recent years, large-scale data collection efforts have prioritized th...
The scale, variety, and quantity of publicly available NLP datasets have ...
Developing documentation guidelines and easy-to-use templates for datase...
We introduce GEM, a living benchmark for natural language Generation (NL...