Exploring Spoken Named Entity Recognition: A Cross-Lingual Perspective

07/03/2023
by   Moncef Benaicha, et al.
0

Recent advancements in Named Entity Recognition (NER) have significantly improved the identification of entities in textual data. However, spoken NER, a specialized field of spoken document retrieval, lags behind due to its limited research and scarce datasets. Moreover, cross-lingual transfer learning in spoken NER has remained unexplored. This paper utilizes transfer learning across Dutch, English, and German using pipeline and End-to-End (E2E) schemes. We employ Wav2Vec2-XLS-R models on custom pseudo-annotated datasets and investigate several architectures for the adaptability of cross-lingual systems. Our results demonstrate that End-to-End spoken NER outperforms pipeline-based alternatives over our limited annotations. Notably, transfer learning from German to Dutch surpasses the Dutch E2E system by 7 Dutch pipeline system by 4 transfer learning in spoken NER but also sets promising outcomes for future evaluations, hinting at the need for comprehensive data collection to augment the results.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset