We investigate the predictability of large language model (LLM) capabili...
We train a language model (LM) to robustly answer multistep questions by...
Large Language Models (LLMs) have exhibited an impressive ability to per...
Detecting negatives (such as non-entailment relationships, unanswerable
...
In-context learning (ICL) enables large language models (LLMs) to perfor...
In many task settings, text classification models are likely to encounte...
For vision-and-language reasoning tasks, both fully connectionist, end-t...
In order to reliably process natural language, NLP systems must generali...
Recent work has observed that pre-trained models have higher
out-of-dist...
Question answering (QA) over real-world knowledge bases (KBs) is challen...
In Dynamic Adversarial Data Collection (DADC), human annotators are task...
To create models that are robust across a wide range of test inputs, tra...
This paper proposes a pre-training objective based on question answering...
We release a new benchmark for lexical substitution, the task of finding...
Estimating the expected output quality of generation systems is central ...
Despite the availability of very large datasets and pretrained models,
s...
A possible explanation for the impressive performance of masked language...
Datasets are not only resources for training accurate, deployable system...
While research on explaining predictions of open-domain QA systems (ODQA...
Given the increasingly prominent role NLP models (will) play in our live...
Despite its importance to experimental design, statistical power (the
pr...
Many pairwise classification tasks, such as paraphrase detection and
ope...
To avoid giving wrong answers, question answering (QA) models need to kn...
Despite excellent performance on many tasks, NLP systems are easily fool...
We present the results of the Machine Reading for Question Answering (MR...
State-of-the-art NLP models can often be fooled by adversaries that appl...
Most information extraction methods focus on binary relations expressed
...
Extractive reading comprehension systems can often locate the correct an...
We consider the task of text attribute transfer: transforming a sentence...
Standard accuracy metrics indicate that reading comprehension systems ar...
Modeling crisp logical regularities is crucial in semantic parsing, maki...