Despite the dominance and effectiveness of scaling, resulting in large
n...
Pre-trained encoder-only and sequence-to-sequence (seq2seq) models each ...
In recent years, language models have drastically grown in size, and the...
Scaling up weakly-supervised datasets has shown to be highly effective i...
Language model probing is often used to test specific capabilities of th...
This paper presents a systematic overview and comparison of
parameter-ef...
Multi-hop Question Generation is the task of generating questions which
...
Recent advancements in dialogue response selection (DRS) are based on th...
In this work, we demonstrate that multilingual large-scale
sequence-to-s...
Existing question answering (QA) datasets derived from electronic health...
Existing pre-trained transformer analysis works usually focus only on on...
Solving crossword puzzles requires diverse reasoning capabilities, acces...
Machine Learning (ML) systems are getting increasingly popular, and driv...
Transformer-based encoder-decoder models produce a fused token-wise
repr...
Recent advances in deep learning have drastically improved performance o...
Multiple studies have shown that BERT is remarkably robust to pruning, y...
A semantic parsing model is crucial to natural language processing
appli...
Much of the recent success in NLP is due to the large Transformer-based
...
Transformer-based models are now widely used in NLP, but we still do not...
Recent dialogue approaches operate by reading each word in a conversatio...
The Transformer architecture has become increasingly popular over the pa...
We present NarrativeTime, a new timeline-based annotation scheme for tem...
This paper proposes a Transformer-based model to generate equations for ...
BERT-based architectures currently give state-of-the-art performance on ...
There is a growing body of work that proposes methods for mitigating bia...
Generative Adversarial Networks (GANs) have experienced a recent surge i...
We propose a triad-based neural network system that generates affinity s...
In this paper, we present a method for adversarial decomposition of text...
Clinical notes often describe important aspects of a patient's stay and ...
Learning a better representation with neural networks is a challenging
p...
In this paper, we propose to use a set of simple, uniform in architectur...
One of the major goals in automated argumentation mining is to uncover t...
In this work, we present a new dataset for computational humor, specific...
Language generation tasks that seek to mimic human ability to use langua...
We analyze the RI-TIMEXes in temporally annotated corpora and propose tw...