Dual Language Models for Code Mixed Speech Recognition

11/03/2017
by Saurabh Garg, et al.

In this work, we present a new approach to language modeling for bilingual code-switched text. This technique, called dual language models, involves building two complementary monolingual language models and combining them using a probabilistic model for switching between the two. The objective of this technique is to improve generalization when the amount of code-switched training data is limited. We evaluate the efficacy of our approach using a conversational Mandarin-English speech corpus. Using our model, we obtain significant improvements in both perplexity measures and automatic speech recognition error rates compared to a standard bilingual language model.
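The core idea of combining two monolingual language models with a probabilistic switching mechanism can be sketched as follows. This is a minimal illustrative example, not the authors' implementation: the toy unigram probabilities, the word-lookup language identifier, and the fixed switch probability `P_SWITCH` are all assumptions made for the sketch.

```python
import math

# Toy monolingual unigram LMs with illustrative (untrained) probabilities.
en_lm = {"i": 0.3, "want": 0.3, "to": 0.2, "eat": 0.2}
zh_lm = {"chifan": 0.5, "wo": 0.3, "xiang": 0.2}

P_SWITCH = 0.1  # assumed probability of switching languages between tokens


def lang_of(word):
    # Hypothetical language ID: decide by lexicon membership.
    return "en" if word in en_lm else "zh"


def dual_lm_logprob(tokens):
    """Score a code-mixed token sequence with two monolingual LMs
    joined by a simple Bernoulli switch/stay model at each boundary."""
    logp = 0.0
    prev_lang = None
    for w in tokens:
        lang = lang_of(w)
        lm = en_lm if lang == "en" else zh_lm
        if prev_lang is not None:
            # Pay the switch cost when the language changes, stay cost otherwise.
            logp += math.log(P_SWITCH if lang != prev_lang else 1 - P_SWITCH)
        # Each token is scored only by its own language's model.
        logp += math.log(lm[w])
        prev_lang = lang
    return logp
```

Because each monolingual model is trained only on its own language, the combined model can generalize even when code-switched training data is scarce; the switch probability alone captures the mixing behavior.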
