Decoder-only Large Language Models (LLMs) have demonstrated potential in...
We present Belebele, a multiple-choice machine reading comprehension (MR...
As increasingly sophisticated language models emerge, their trustworthin...
Translate-test is a popular technique to improve the performance of mult...
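The translate-test idea named above can be sketched in a few lines: translate the non-English test input into English, then apply an existing English-only classifier to the translation. Everything below is a hypothetical stand-in (a toy word-level lexicon for the MT step and a keyword rule for the classifier); a real pipeline would use a trained MT system and a trained English model.

```python
# Translate-test sketch with toy stand-ins for the MT system and classifier.
def translate_to_english(text: str) -> str:
    # Hypothetical MT step: word-by-word toy lexicon (Spanish -> English).
    toy_lexicon = {"película": "movie", "excelente": "excellent"}
    return " ".join(toy_lexicon.get(w, w) for w in text.split())

def english_classifier(text: str) -> str:
    # Hypothetical English-only sentiment classifier (keyword rule).
    return "positive" if "excellent" in text else "negative"

def translate_test(text: str) -> str:
    # Core idea: translate the test input into English, then reuse the
    # English model unchanged.
    return english_classifier(translate_to_english(text))

print(translate_test("película excelente"))  # -> positive
```

The appeal of the approach is that only the MT step needs to handle the non-English language; the downstream model never sees anything but English.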
Pretrained language models (PLMs) are today the primary model for natura...
Machine Translation (MT) has been widely used for cross-lingual classifi...
Prior work has shown that it is possible to expand pretrained Masked Lan...
While prior work has established that the use of parallel data is conduc...
Scaling up language models has led to unprecedented performance gains, b...
Masked language models like BERT can perform text classification in a ze...
Pre-trained masked language models successfully perform few-shot learnin...
Round-trip Machine Translation (MT) is a popular choice for paraphrase g...
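Round-trip MT for paraphrasing, as named in the entry above, pivots a sentence through a second language and back; the return translation often has a different surface form with the same meaning. The two toy lexicons below are hypothetical stand-ins for real forward and backward MT models.

```python
# Round-trip MT paraphrasing sketch: pivot out to French and back to English.
FORWARD = {"big": "grand", "house": "maison"}     # English -> French (toy)
BACKWARD = {"grand": "large", "maison": "house"}  # French -> English (toy)

def translate(text: str, lexicon: dict) -> str:
    # Toy word-by-word "translation"; a real system would be an MT model.
    return " ".join(lexicon.get(w, w) for w in text.split())

def round_trip_paraphrase(text: str) -> str:
    # The backward lexicon maps "grand" to "large" rather than back to
    # "big", so the round trip yields a paraphrase of the input.
    return translate(translate(text, FORWARD), BACKWARD)

print(round_trip_paraphrase("big house"))  # -> large house
```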
Formal verse poetry imposes strict constraints on the meter and rhyme sc...
Prior work on language model pre-training has explored different archite...
Multilingual machine translation suffers from negative interference acro...
Multilingual pre-trained models are known to suffer from the curse of mu...
Large language models, which are often trained for hundreds of thousands...
The vast majority of non-English corpora are derived from automatically...
All-MLP architectures have attracted increasing interest as an alternati...
Large language models (LMs) are able to in-context learn – perform a new...
Mixture of Experts layers (MoEs) enable efficient scaling of language mo...
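The efficiency argument for MoE layers mentioned above can be illustrated with a minimal top-1 routing sketch (in the style of Switch-Transformer-like routing, here a hypothetical toy with plain lists): a gate scores every expert, but only the single best-scoring expert actually runs, so per-token compute stays roughly constant while total parameters grow with the number of experts.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

def moe_layer(x, gate_weights, experts):
    # Gate: one logit per expert via a dot product with the input vector.
    logits = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in gate_weights]
    probs = softmax(logits)
    # Top-1 routing: only the best-scoring expert is evaluated; its output
    # is scaled by the gate probability so the gate stays differentiable.
    k = max(range(len(probs)), key=lambda i: probs[i])
    return [probs[k] * y for y in experts[k](x)]

experts = [lambda x: [2 * v for v in x],   # expert 0: doubles the input
           lambda x: [-v for v in x]]      # expert 1: negates the input
out = moe_layer([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], experts)
print(out)
```

With this input the gate prefers expert 0, so only the doubling expert runs, scaled by its gate probability.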
Large-scale autoregressive language models such as GPT-3 are few-shot le...
Despite the success of multilingual sequence-to-sequence pretraining, mo...
Existing models of multilingual sentence embeddings require large parall...
We present mGENRE, a sequence-to-sequence system for the Multilingual En...
Recent research on cross-lingual word embeddings has been dominated by u...
We propose a modular architecture of language-specific encoder-decoders ...
We review motivations, definition, approaches, and methodology for unsup...
State-of-the-art multilingual machine translation relies on a universal...
Both human and machine translation play a central role in cross-lingual...
Back-translation provides a simple yet effective approach to exploit mon...
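Back-translation, as named in the entry above, turns monolingual target-side text into synthetic parallel data: a target-to-source model translates each real target sentence "backwards", and the resulting pairs augment the training data of the source-to-target model. The `target_to_source` function below is a hypothetical word-level toy standing in for a trained reverse MT model (target = English, source = Spanish here).

```python
# Back-translation sketch: synthesize parallel pairs from monolingual text.
def target_to_source(sentence: str) -> str:
    # Hypothetical target->source MT model (toy word-by-word lexicon; a
    # real model would also handle reordering, e.g. "casa roja").
    toy = {"red": "roja", "house": "casa"}
    return " ".join(toy.get(w, w) for w in sentence.split())

def back_translate(monolingual_target):
    # Pair each real target sentence with its synthetic source translation;
    # the target side stays clean, which is why the method works well.
    return [(target_to_source(t), t) for t in monolingual_target]

pairs = back_translate(["red house"])
print(pairs)  # -> [('roja casa', 'red house')]
```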
State-of-the-art unsupervised multilingual models (e.g., multilingual BE...
A recent research line has obtained strong results on bilingual lexicon...
Recent research in cross-lingual word embeddings has almost exclusively...
While machine translation has traditionally relied on large amounts of p...
We introduce an architecture to learn joint multilingual sentence repres...
Machine translation is highly sensitive to the size and quality of the t...
Following the recent success of word embeddings, it has been argued that...
While modern machine translation has relied on large parallel corpora, a...
Recent work has managed to learn cross-lingual word embeddings without p...
In spite of the recent success of neural machine translation (NMT) in st...