Language models still struggle with moral reasoning, despite their impress...
Recent research has shown that language models exploit `artifacts' in be...
Mathematical reasoning skills are essential for general-purpose intellig...
Pretrained Transformers (PT) have been shown to improve Out of Distribut...
Evaluation of models on benchmarks is unreliable without knowing the deg...
Several benchmarks have been built with heavy investment in resources to...
With the increasing importance of safety requirements associated with th...
When answering a question, humans utilize the information available acro...
Controlling the text generated by language models and customizing the co...
Table Question Answering (TQA) is an important but under-explored task. ...
Large Language Models (LMs) have achieved state-of-the-art performance o...
Curriculum learning strategies in prior multi-task learning approaches a...
In recent years, progress in NLU has been driven by benchmarks. These be...
Single-task models have proven pivotal in solving specific tasks; howeve...
Given the ubiquitous nature of numbers in text, reasoning with numbers t...
The recently introduced instruction paradigm empowers non-expert users to le...
Despite the success of large pre-trained language models (LMs) such as C...
Data modification, either via additional training datasets, data augment...
Even though deep neural models have achieved superhuman performance on m...
Knowledge of questions' difficulty level helps a teacher in several ways...
In order to equip NLP systems with selective prediction capability, seve...
How can model designers turn task instructions into effective prompts fo...
As systems like the smart grid grow more complex by the day, ...
Standard NLP tasks do not incorporate several common real-world scenario...
Deep Learning's outstanding track record across several domains has stem...
Models that top leaderboards often perform unsatisfactorily when deploye...
Following procedural texts written in natural languages is challenging. ...
Can we enable NLP models to appropriately respond to instructional promp...
In order to make AI systems more reliable and their adoption in safety c...
A `state of the art' model A surpasses humans on a benchmark B, but fail...
Models that surpass human performance on several popular benchmarks disp...
Numerical reasoning is often important to accurately understand the worl...
Neural language models have achieved human-level performance across seve...
DARPA and Allen AI have proposed a collection of datasets to encourage r...