Robin Jia

research

∙ 05/24/2023

How Predictable Are Large Language Model Capabilities? A Case Study on BIG-bench

We investigate the predictability of large language model (LLM) capabili...

Qinyuan Ye, et al.

research

∙ 05/24/2023

Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering

We train a language model (LM) to robustly answer multistep questions by...

Wang Zhu, et al.

research

∙ 05/24/2023

Estimating Large Language Model Capabilities without Labeled Test Data

Large Language Models (LLMs) have exhibited an impressive ability to per...

Harvey Yiyun Fu, et al.

research

∙ 05/13/2023

SCENE: Self-Labeled Counterfactuals for Extrapolating to Negative Examples

Detecting negatives (such as non-entailment relationships, unanswerable ...

Deqing Fu, et al.

research

∙ 12/20/2022

Careful Data Curation Stabilizes In-context Learning

In-context learning (ICL) enables large language models (LLMs) to perfor...

Ting-Yun Chang, et al.

research

∙ 11/28/2022

CoNAL: Anticipating Outliers with Large Language Models

In many task settings, text classification models are likely to encounte...

Albert Xu, et al.

research

∙ 10/26/2022

Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems

For vision-and-language reasoning tasks, both fully connectionist, end-t...

Wang Zhu, et al.

research

∙ 10/13/2022

Benchmarking Long-tail Generalization with Likelihood Splits

In order to reliably process natural language, NLP systems must generali...

Ameya Godbole, et al.

research

∙ 10/12/2022

Are Sample-Efficient NLP Models More Robust?

Recent work has observed that pre-trained models have higher out-of-dist...

Nelson F. Liu, et al.

research

∙ 02/22/2022

Knowledge Base Question Answering by Case-based Reasoning over Subgraphs

Question answering (QA) over real-world knowledge bases (KBs) is challen...

Rajarshi Das, et al.

research

∙ 12/16/2021

Models in the Loop: Aiding Crowdworkers with Generative Annotation Assistants

In Dynamic Adversarial Data Collection (DADC), human annotators are task...

Max Bartolo, et al.

research

∙ 10/16/2021

Analyzing Dynamic Adversarial Training Data in the Limit

To create models that are robust across a wide range of test inputs, tra...

Eric Wallace, et al.

research

∙ 06/15/2021

Question Answering Infused Pre-training of General-Purpose Contextualized Representations

This paper proposes a pre-training objective based on question answering...

Robin Jia, et al.

research

∙ 06/08/2021

Swords: A Benchmark for Lexical Substitution with Improved Data Coverage and Quality

We release a new benchmark for lexical substitution, the task of finding...

Mina Lee, et al.

research

∙ 05/26/2021

The statistical advantage of automatic NLG metrics at the system level

Estimating the expected output quality of generation systems is central ...

Johnny Tian-Zheng Wei, et al.

research

∙ 04/18/2021

Improving Question Answering Model Robustness with Synthetic Adversarial Data Generation

Despite the availability of very large datasets and pretrained models, s...

Max Bartolo, et al.

research

∙ 04/14/2021

Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little

A possible explanation for the impressive performance of masked language...

Koustuv Sinha, et al.

research

∙ 02/01/2021

Can Small and Synthetic Benchmarks Drive Modeling Innovation? A Retrospective Study of Question Answering Modeling Approaches

Datasets are not only resources for training accurate, deployable system...

Nelson F. Liu, et al.

research

∙ 12/30/2020

Human Evaluation of Spoken vs. Visual Explanations for Open-Domain QA

While research on explaining predictions of open-domain QA systems (ODQA...

Ana Valeria Gonzalez, et al.

research

∙ 12/24/2020

To what extent do human explanations of model behavior align with actual model behavior?

Given the increasingly prominent role NLP models (will) play in our live...

Grusha Prasad, et al.

research

∙ 10/13/2020

With Little Power Comes Great Responsibility

Despite its importance to experimental design, statistical power (the pr...

Dallas Card, et al.

research

∙ 10/10/2020

On the Importance of Adaptive Data Collection for Extremely Imbalanced Pairwise Tasks

Many pairwise classification tasks, such as paraphrase detection and ope...

Stephen Mussmann, et al.

research

∙ 06/16/2020

Selective Question Answering under Domain Shift

To avoid giving wrong answers, question answering (QA) models need to kn...

Amita Kamath, et al.

research

∙ 05/04/2020

Robust Encodings: A Framework for Combating Adversarial Typos

Despite excellent performance on many tasks, NLP systems are easily fool...

Erik Jones, et al.

research

∙ 10/22/2019

MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension

We present the results of the Machine Reading for Question Answering (MR...

Adam Fisch, et al.

research

∙ 09/03/2019

Certified Robustness to Adversarial Word Substitutions

State-of-the-art NLP models can often be fooled by adversaries that appl...

Robin Jia, et al.

research

∙ 04/04/2019

Document-Level N-ary Relation Extraction with Multiscale Representation Learning

Most information extraction methods focus on binary relations expressed ...

Robin Jia, et al.

research

∙ 06/11/2018

Know What You Don't Know: Unanswerable Questions for SQuAD

Extractive reading comprehension systems can often locate the correct an...

Pranav Rajpurkar, et al.

research

∙ 04/17/2018

Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer

We consider the task of text attribute transfer: transforming a sentence...

Juncen Li, et al.

research

∙ 07/23/2017

Adversarial Examples for Evaluating Reading Comprehension Systems

Standard accuracy metrics indicate that reading comprehension systems ar...

Robin Jia, et al.

research

∙ 06/11/2016

Data Recombination for Neural Semantic Parsing

Modeling crisp logical regularities is crucial in semantic parsing, maki...

Robin Jia, et al.
