Subword tokenization is the de facto standard for tokenization in neural...
Large language models (LLMs) have demonstrated impressive few-shot learn...
We hypothesize that existing sentence-level machine translation (MT) met...
State-of-the-art multilingual systems rely on shared vocabularies that
s...
Common acquisition functions for active learning use either uncertainty ...
In Natural Language Processing (NLP), pre-trained language models (LMs) ...