emLam -- a Hungarian Language Modeling baseline

01/26/2017
by Dávid Márk Nemeskey, et al.

This paper aims to make up for the lack of documented baselines for Hungarian language modeling. Various approaches are evaluated on three publicly available Hungarian corpora. The reported perplexity values are comparable to those of models trained on similar-sized English corpora. A new, freely downloadable Hungarian benchmark corpus is introduced.
