Coaxing out desired behavior from pretrained models, while avoiding unde...
We present QLoRA, an efficient finetuning approach that reduces memory u...
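As a rough illustration of the memory-efficient finetuning recipe QLoRA describes (a frozen 4-bit base model with small trainable LoRA adapters), here is a minimal sketch using the Hugging Face `transformers`, `bitsandbytes`, and `peft` libraries. NF4 quantization and double quantization are from the paper; the checkpoint name and LoRA hyperparameters below are placeholders, not the paper's exact configuration.

```python
# Minimal QLoRA-style setup sketch: quantize the base model to 4 bits,
# then attach trainable low-rank adapters. Hyperparameters are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store base weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_use_double_quant=True,         # also quantize quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder checkpoint
    quantization_config=bnb_config,
)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adapters on attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights remain trainable
```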
Large language models can perform new tasks in a zero-shot fashion, give...
Likelihood, although useful as a training loss, is a poor search objecti...
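A toy illustration of this point, not taken from the paper: with a hand-built bigram model, pure likelihood maximization falls into a repetition loop, while sampling from the very same distribution yields varied continuations.

```python
# Toy demonstration (illustrative only): argmax decoding under a bigram
# model cycles, while ancestral sampling from the same model does not.
import random

bigram = {
    "<s>":   {"i": 0.6, "we": 0.4},
    "i":     {"am": 0.7, "think": 0.3},
    "we":    {"think": 1.0},
    "am":    {"sure": 0.5, "happy": 0.5},
    "think": {"so": 0.6, "i": 0.4},
    "sure":  {"so": 1.0},
    "happy": {"so": 1.0},
    "so":    {"i": 0.9, "we": 0.1},   # argmax of "so" points back to "i": a cycle
}

def decode(n_tokens, greedy=True, seed=0):
    random.seed(seed)
    tok, out = "<s>", []
    for _ in range(n_tokens):
        dist = bigram[tok]
        if greedy:
            tok = max(dist, key=dist.get)                 # likelihood maximization
        else:
            toks, probs = zip(*dist.items())
            tok = random.choices(toks, weights=probs)[0]  # ancestral sampling
        out.append(tok)
    return " ".join(out)

print("greedy: ", decode(12))                 # "i am sure so i am sure so ..."
print("sampled:", decode(12, greedy=False))   # varied output, same model
```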
We present the results of the NLP Community Metasurvey. Run from May to ...
Large language models (LMs) are able to in-context learn – perform a new...
We introduce a new domain expert mixture (DEMix) layer that enables cond...
We propose PIGLeT: a model that learns physical commonsense knowledge th...
Image captioning has conventionally relied on reference-based automatic ...
Large language models have shown promising results in zero-shot settings...
We study conversational dialog in which there are many possible response...
Pretrained Language Models (LMs) generate text with remarkable quality, ...
Successful linguistic communication relies on a shared experience of the...
There is a fundamental gap between how humans understand and use languag...
The principle of the Information Bottleneck (Tishby et al. 1999) is to p...
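For reference, the Information Bottleneck objective cited here is standard and can be stated compactly: a summary T of X should be as compressed as possible while staying informative about a relevant variable Y, with β ≥ 0 trading off the two mutual-information terms.

```latex
% Information Bottleneck Lagrangian (Tishby et al. 1999):
% compress X into a summary T while preserving information about Y.
\min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)
```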
Counterfactual reasoning requires predicting how alternative events, con...
Abductive reasoning is inference to the most plausible explanation. For ...
Humans understand language based on the rich background knowledge about ...
We introduce Cooperative Generator-Discriminator Networks (Co-opNet), a ...
Recent progress in natural language generation has raised dual-use conce...
Recent work by Zellers et al. (2018) introduced a new task of commonsens...
Despite considerable advancements with deep neural language models, the ...
We present FAST NAVIGATOR, a general framework for action decoding, whic...
Recurrent Neural Networks (RNNs) are powerful autoregressive sequence mo...
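As a generic refresher on the autoregressive RNN setup this abstract starts from (not the paper's model), a minimal PyTorch sketch in which each predicted token is fed back as the next input:

```python
# Minimal autoregressive RNN language model (illustrative only):
# generation feeds each greedy prediction back in as the next input.
import torch
import torch.nn as nn

class TinyRNNLM(nn.Module):
    def __init__(self, vocab_size=100, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.GRU(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)           # (batch, time, emb)
        out, state = self.rnn(x, state)
        return self.head(out), state     # logits over the next token

@torch.no_grad()
def generate(model, start_token, n_steps=20):
    tok = torch.tensor([[start_token]])
    state, out = None, []
    for _ in range(n_steps):
        logits, state = model(tok, state)
        tok = logits[:, -1].argmax(dim=-1, keepdim=True)  # greedy next token
        out.append(tok.item())
    return out

print(generate(TinyRNNLM(), start_token=0))
```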
We present Sounding Board, a social chatbot that won the 2017 Amazon Ale...
Understanding procedural language requires anticipating the causal effec...