When large language models (LMs) are applied in zero- or few-shot settin...
While large language models (LLMs) are proficient at question-answering ...
Although counterfactual reasoning is a fundamental aspect of intelligenc...
In this paper, we present a novel approach for distilling math word prob...
Despite their unprecedented success, even the largest language models ma...
Like people, LLMs do not always generate the best text for a given gener...
When people think of everyday things like an "egg," they typically have ...
Mathematical reasoning skills are essential for general-purpose intellig...
Figurative language (e.g., "he flew like the wind") is challenging to
un...
Our goal is a question-answering (QA) system that can show how its answe...
Few-shot prompting is a surprisingly powerful way to use Large Language Models...
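Since few-shot prompting is the technique named here, a minimal sketch may help. This is an illustrative example only, not this paper's method; the exemplars and the `call_model` placeholder are invented for the sketch.

```python
# A minimal sketch of few-shot prompting: prepend a handful of worked
# input/output pairs so the model can infer the task format from them.
EXEMPLARS = [
    ("Q: What is 17 + 25?", "A: 42"),
    ("Q: What is 8 * 7?", "A: 56"),
]

def build_few_shot_prompt(question: str) -> str:
    """Join the demonstrations, then append the new question."""
    demos = "\n".join(f"{q}\n{a}" for q, a in EXEMPLARS)
    return f"{demos}\nQ: {question}\nA:"

def call_model(prompt: str) -> str:
    # Hypothetical placeholder: swap in a real LLM completion call here.
    raise NotImplementedError

if __name__ == "__main__":
    print(build_few_shot_prompt("What is 13 + 29?"))
```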
We study the task of prompting large-scale language models to perform mu...
When answering a question, humans utilize the information available acro...
Our goal is a teachable reasoning system for question-answering (QA), wh...
The instruction learning paradigm – where a model learns to perform new tasks...
Given the ubiquitous nature of numbers in text, reasoning with numbers t...
Large LMs such as GPT-3, while powerful, are not immune to mistakes, but...
How can an end-user provide feedback if a deployed structured prediction...
To what extent do language models (LMs) build "mental models" of a scene...
Many real-world problems require the combined application of multiple reasoning...
Although pretrained language models (PTLMs) contain significant amounts ...
Despite the successes of pretrained language models, there are still few...
A class of explainable NLP models for reasoning tasks supports its deci...
Our goal, in the context of open-domain textual question-answering (QA),...
Although pretrained language models (PTLMs) have been shown to contain significant...
Scripts - standardized event sequences describing typical everyday activ...
We present the ARC-DA dataset, a direct-answer ("open response", "freefo...
Transformers have been shown to emulate logical deduction over natural language...
Despite the rapid progress in multihop question-answering (QA), models s...
A common approach to solve complex tasks is by breaking them down into s...
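As a rough illustration of this decompose-and-solve idea (not this paper's actual method), the sketch below answers a two-hop question by chaining single-hop answers; the `#1` substitution convention and the canned single-hop solver are assumptions made for the example.

```python
# Decompose a complex question into simpler sub-questions, answer each
# in turn, and substitute earlier answers into later sub-questions.
CANNED_ANSWERS = {
    "Who wrote Hamlet?": "Shakespeare",
    "When was Shakespeare born?": "1564",
}

def answer_single_hop(question: str) -> str:
    # Stand-in for a model that handles one simple question.
    return CANNED_ANSWERS.get(question, "unknown")

def answer_by_decomposition(sub_questions: list[str]) -> str:
    """Answer sub-questions in order; '#1' in a later question is
    replaced by the answer to the first one, and so on."""
    answers: list[str] = []
    for q in sub_questions:
        for i, a in enumerate(answers, start=1):
            q = q.replace(f"#{i}", a)
        answers.append(answer_single_hop(q))
    return answers[-1]

# "When was the author of Hamlet born?" decomposed into two hops:
print(answer_by_decomposition(["Who wrote Hamlet?", "When was #1 born?"]))  # -> 1564
```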
We present a new knowledge-base of hasPart relationships, extracted from...
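To make the shape of such a resource concrete, here is a toy way one might store and query hasPart triples in Python; the (whole, part, confidence) schema is an assumption for illustration, not the released knowledge-base's actual format.

```python
# Toy storage and lookup for hasPart relationships.
from collections import defaultdict

# Hypothetical triples: (whole, part, confidence score).
triples = [
    ("car", "engine", 0.98),
    ("car", "wheel", 0.97),
    ("flower", "petal", 0.95),
]

# Index the triples by the "whole" entity for fast part lookup.
parts_of = defaultdict(list)
for whole, part, conf in triples:
    parts_of[whole].append((part, conf))

print(parts_of["car"])  # [('engine', 0.98), ('wheel', 0.97)]
```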
To what extent can a neural network systematically reason over symbolic...
This paper describes a new technique, called "knowledge patterns", for h...
Question answering (QA) tasks have been posed using a variety of formats...
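For concreteness, the toy records below show one underlying question rendered in two common QA formats (multiple choice and extractive span selection); the field names are invented for this sketch, not any dataset's actual schema.

```python
# The same question posed in two different QA formats.
multiple_choice = {
    "question": "Which gas do plants absorb during photosynthesis?",
    "choices": ["oxygen", "carbon dioxide", "nitrogen"],
    "answer": "carbon dioxide",
}

extractive = {
    "context": "During photosynthesis, plants absorb carbon dioxide "
               "and release oxygen.",
    "question": "Which gas do plants absorb during photosynthesis?",
    "answer_span": "carbon dioxide",  # a span copied from the context
}
```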
We present a new resource for the NLP community, namely a large (3.5M+ sentence)...
AI has long pursued the goal of having systems reason over *explicitly provided* knowledge...
Composing knowledge from multiple pieces of text is a key challenge in...
Multi-hop textual question answering requires combining information from...
Our goal is to better comprehend procedural text, e.g., a paragraph abou...
We introduce WIQA, the first large-scale dataset of "What if..." questio...
We introduce the first open-domain dataset, called QuaRTz, for reasoning...
AI has achieved remarkable mastery over games such as Chess, Go, and Pok...
A key component of successfully reading a passage of text is the ability...
Prior work has demonstrated that question classification (QC), recognizi...
Our goal is procedural text comprehension, namely tracking how the prope...
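As a rough picture of what tracking entity properties through a procedure means, the toy sketch below follows one entity's location across the steps of a paragraph; the grid representation and example values are illustrative only, not this paper's model.

```python
# Track where the entity "water" is before and after each procedure step.
steps = [
    "Roots absorb water from the soil.",
    "Water travels up the stem to the leaves.",
]

# Location of "water": before step 1, after step 1, after step 2.
water_locations = ["soil", "roots", "leaves"]

for i, step in enumerate(steps):
    print(f"Step {i + 1}: {step}")
    print(f"  water: {water_locations[i]} -> {water_locations[i + 1]}")
```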
While in recent years machine learning (ML)-based approaches have been t...
Many natural language questions require recognizing and reasoning with qualitative...
We present a new kind of question answering dataset, OpenBookQA, modeled...
Comprehending procedural text, e.g., a paragraph describing photosynthes...