Ancient-Modern Chinese Translation with a Large Training Dataset

08/11/2018
by Dayiheng Liu, et al.

Ancient Chinese carries the wisdom and spiritual culture of the Chinese nation. Automatic translation from ancient Chinese to modern Chinese helps to inherit and carry forward the quintessence of the ancients. In this paper, we propose an Ancient-Modern Chinese clause alignment approach and apply it to create a large-scale Ancient-Modern Chinese parallel corpus containing about 1.24M bilingual pairs. To the best of our knowledge, this is the first large, high-quality Ancient-Modern Chinese dataset. Furthermore, we train SMT and various NMT-based models on this dataset and provide a strong baseline for this task.
