Identifying Computer-Translated Paragraphs using Coherence Features

12/28/2018
by Hoang-Quoc Nguyen-Son, et al.

We have developed a method for extracting coherence features from a paragraph by matching similar words across its sentences. We conducted an experiment with a parallel German corpus containing 2000 human-created and 2000 machine-translated paragraphs. The results showed that our method achieved the best performance (accuracy = 72.3%) compared with previous methods on various computer-generated text, including translation and paper generation (best accuracy = 67.9% ... 32.0 ...). ... resource one (Japanese) attained similar performance. This demonstrates the effectiveness of the coherence features at distinguishing computer-translated from human-created paragraphs across diverse languages.
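
The idea of matching similar words across the sentences of a paragraph can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names are hypothetical, the similarity measure is a character-overlap stand-in from Python's standard library (a real system would more plausibly use a semantic measure), and the summary statistics are just one possible choice of features.

```python
from difflib import SequenceMatcher
from itertools import combinations
from statistics import mean, pstdev


def word_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]: character overlap, not semantics (assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def sentence_pair_scores(s1: list[str], s2: list[str]) -> list[float]:
    """For each word in s1, keep the similarity of its best match in s2."""
    return [max(word_similarity(w, v) for v in s2) for w in s1]


def coherence_features(paragraph: list[list[str]]) -> dict[str, float]:
    """Summarize best-match similarities over all sentence pairs.

    `paragraph` is a list of tokenized sentences. The mean/std/max summary
    is a hypothetical feature set, not the one used in the paper.
    """
    scores: list[float] = []
    for s1, s2 in combinations(paragraph, 2):
        if s1 and s2:  # skip empty sentences
            scores.extend(sentence_pair_scores(s1, s2))
    if not scores:  # fewer than two non-empty sentences
        return {"mean": 0.0, "std": 0.0, "max": 0.0}
    return {"mean": mean(scores), "std": pstdev(scores), "max": max(scores)}


if __name__ == "__main__":
    paragraph = [["the", "cat", "sat"], ["a", "cat", "slept"]]
    print(coherence_features(paragraph))
```

Feature vectors of this kind could then be passed to any standard classifier to separate human-created from machine-translated paragraphs; the choice of classifier is likewise left open here.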
