Data scarcity is a crucial issue for the development of highly multiling...
We examine whether large neural language models, trained on very large c...
This paper presents an open-source software library that provides a set ...
Since its original appearance in 1991, the Perso-Arabic script represent...
Ad hoc abbreviations are commonly found in informal communication channe...
This work presents an information-theoretic operationalisation of cross-...
Psycholinguistic studies of human word processing and lexical access pro...
This paper describes the Dakshina dataset, a new resource consisting of ...
We present methods for calculating a measure of phonotactic complexity—b...
Multilingual Automated Speech Recognition (ASR) systems allow for the jo...
A longstanding debate in semiotics centers on the relationship between l...
How language-agnostic are current state-of-the-art NLP tools? Are there ...
Weighted finite automata (WFA) are often used to represent probabilistic...
For general modeling methods applied to diverse languages, a natural que...