Self-supervised pre-training of language models usually consists in pred...
We introduce SONAR, a new multilingual and multimodal fixed-size sentenc...
The representation degeneration problem is a phenomenon that is widely o...
Recent advances in natural language processing (NLP) have led to the dev...
Recent advances in NLP have significantly improved the performance of la...
Existing approaches for unsupervised bilingual lexicon induction (BLI) o...
One of the major challenges of machine translation (MT) is ambiguity, wh...
Static subword tokenization algorithms have been an essential component ...
We present SpeechMatrix, a large-scale multilingual corpus of speech-to-...
Finding word boundaries in continuous speech is challenging as there is ...
In text summarization and simplification, system outputs must be evaluat...
We present a new approach to perform zero-shot cross-modal transfer betw...
We introduce a simple neural encoder architecture that can be trained us...
We introduce dGSLM, the first "textless" model able to generate audio sa...
Recent work in spoken language modeling shows the possibility of learnin...
Language models for historical states of language are becoming increasin...
The need for large raw corpora has dramatically increased in recent ...
What are the units of text that we want to model? From bytes to multi-wo...
Recent impressive improvements in NLP, largely based on the success of c...
Automatic evaluation remains an open research question in Natural Langua...
With the success of large-scale pre-training and multilingual modeling i...
Multilingual pretrained language models have demonstrated remarkable zer...
Transfer learning based on pretraining language models on a large amount...
Coupled with the availability of large scale datasets, deep learning arc...
Speech embeddings are fixed-size acoustic representations of variable-le...
We use the multilingual OSCAR corpus, extracted from Common Crawl via la...
The French TreeBank developed at the University Paris 7 is the main sour...
In order to simplify a sentence, human editors perform multiple rewritin...
Progress in Sentence Simplification has been hindered by the lack of sup...
Building natural language processing systems for non-standardized and lo...
LSTMs have proven very successful at language modeling. However, it rema...
Pretrained language models are now ubiquitous in Natural Language Proces...
Text simplification aims at making a text easier to read and understand ...
The evaluation of text simplification (TS) systems remains an open chall...
Morphosyntactic lexicons and word vector representations have both prove...