Controlling Text Complexity in Neural Machine Translation

11/03/2019
by Sweta Agrawal, et al.

This work introduces a machine translation task in which the output is aimed at audiences with different levels of target-language proficiency. We collect a high-quality dataset of news articles available in English and Spanish, written for diverse grade levels, and propose a method to align segments across comparable bilingual articles. The resulting dataset makes it possible to train multi-task sequence-to-sequence models that translate Spanish into English targeted at an easier reading grade level than the original Spanish. We show that these multi-task models outperform pipeline approaches that translate and simplify text independently.
