N-gram and Neural Language Models for Discriminating Similar Languages

08/11/2017
by Andre Cianflone, et al.

This paper describes our submission (named clac) to the 2016 Discriminating Similar Languages (DSL) shared task. We participated in the closed Sub-task 1 (Set A) with two separate machine learning techniques. The first approach is a character-based Convolutional Neural Network with a bidirectional long short-term memory (BiLSTM) layer (CLSTM), which achieved an accuracy of 78.45% with minimal tuning. The second approach is a character-based n-gram model. This last approach achieved an accuracy of 88.45%, which is close to the accuracy of 89.38% achieved by the best submission.
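
To make the second approach concrete, the following is a minimal sketch of how a character-based n-gram model can be used to discriminate languages: one n-gram model is trained per language, and a sentence is assigned to the language whose model gives it the highest log-probability. The abstract does not specify the n-gram order, smoothing, or features used in the submission, so the trigram order, the add-one smoothing, and the names char_ngrams and CharNgramClassifier are all assumptions made for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Return the overlapping character n-grams of text, left-padded with spaces."""
    padded = " " * (n - 1) + text
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class CharNgramClassifier:
    """Hypothetical per-language character n-gram model (illustrative only)."""

    def __init__(self, n=3):
        self.n = n                # assumed n-gram order, not taken from the paper
        self.ngram_counts = {}    # label -> Counter of n-grams
        self.context_counts = {}  # label -> Counter of (n-1)-character contexts
        self.alphabet = set()     # characters seen in any training text

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            ngrams = self.ngram_counts.setdefault(label, Counter())
            contexts = self.context_counts.setdefault(label, Counter())
            for gram in char_ngrams(text, self.n):
                ngrams[gram] += 1
                contexts[gram[:-1]] += 1
            self.alphabet.update(text)
        return self

    def log_prob(self, text, label):
        """Add-one smoothed log-probability of text under one language's model."""
        vocab = len(self.alphabet) + 1  # one extra slot for unseen characters
        ngrams = self.ngram_counts[label]
        contexts = self.context_counts[label]
        total = 0.0
        for gram in char_ngrams(text, self.n):
            total += math.log((ngrams[gram] + 1) / (contexts[gram[:-1]] + vocab))
        return total

    def predict(self, text):
        """Return the label whose model scores the text highest."""
        return max(self.ngram_counts, key=lambda label: self.log_prob(text, label))
```

Calling fit on labelled sentences and then predict on an unseen sentence returns one of the training labels. On closely related languages, a realistic system would need far more training data and a stronger smoothing scheme than the add-one smoothing assumed here.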
