Large language models (LLMs) demonstrate remarkable machine translation ...
Masked Language Modeling (MLM) has been one of the most prominent approa...
Large multilingual language models typically rely on a single vocabulary...
Large-scale generative models show an impressive ability to perform a wi...
Generative models of code, pretrained on large corpora of programs, have...
Recently, there has been a surge of interest in the NLP community on the...
Current efficient fine-tuning methods (e.g., adapters, prefix-tuning, et...
Mined bitexts can contain imperfect translations that yield unreliable t...
Multilingual neural machine translation (MNMT) learns to translate multi...
Many commonsense reasoning NLP tasks involve choosing between one or mor...
Current abstractive summarization systems outperform their extractive co...
Semantic parsing using sequence-to-sequence models allows parsing of dee...
Neural sequence models can generate highly fluent sentences but recent s...
Models pretrained with self-supervised objectives on large text corpora ...
We introduce a very deep and light-weight transformer, DeLighT, that del...
We introduce MARGE, a pre-trained sequence-to-sequence model learned wit...
There has been recent success in pre-training on monolingual data and fi...
Non-autoregressive machine translation models significantly speed up dec...
The recently proposed mask-predict decoding algorithm has narrowed the p...
This paper demonstrates that multilingual denoising pre-training produce...
State-of-the-art neural machine translation models generate a translatio...
We present BART, a denoising autoencoder for pretraining sequence-to-seq...
Given a rough, word-by-word gloss of a source language sentence, target ...
Most machine translation systems generate text autoregressively, by sequ...
We consider the problem of making machine translation more robust to cha...
Neural network models are capable of generating extremely natural soundi...