India has a rich linguistic landscape with languages from 4 major langua...
We create publicly available language identification (LID) datasets and
...
Large language models have demonstrated the capability to perform well o...
Adapters have been positioned as a parameter-efficient fine-tuning (PEFT...
We address the task of machine translation from an extremely low-resourc...
The rapid growth of machine translation (MT) systems has necessitated
co...
We present Naamapadam, the largest publicly available Named Entity
Reco...
In this work, we introduce IndicXTREME, a benchmark consisting of nine
d...
End-to-end (E2E) models have become the default choice for state-of-the-...
A cornerstone in AI research has been the creation and adoption of
stand...
We introduce Aksharantar, the largest publicly available transliteration...
Recent methods in speech and language technology pretrain very LARGE mod...
In this paper, we present an extensive investigation of multi-bridge,
ma...
In this paper, we present IndicBART, a multilingual, sequence-to-sequence...
Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc.
hav...
This work introduces Itihasa, a large-scale translation dataset containi...
We present Samanantar, the largest publicly available parallel corpora
c...
We present the IndicNLP corpus, a large-scale, general-domain corpus
con...
We propose a geometric framework for learning meta-embeddings of words f...
In this work, we present an extensive study of statistical machine
trans...
We present a survey on multilingual neural machine translation (MNMT), w...
We present a survey on multilingual neural machine translation (MNMT), w...
Transfer learning approaches for Neural Machine Translation (NMT) train ...
In this paper, we introduce McTorch, a manifold optimization library for...
We propose a novel geometric approach for learning bilingual mappings gi...
We propose a novel geometric approach for learning bilingual mappings gi...
We present the IIT Bombay English-Hindi Parallel Corpus. The corpus is a...
We investigate pivot-based translation between related languages in a lo...
A common and effective way to train translation systems between related
...
We explore the use of segments learnt using Byte Pair Encoding (referred...
We explore the use of the orthographic syllable, a variable-length
conso...