Democratizing access to natural language processing (NLP) technology is
...
As large language models (LLMs) continue to advance, accurately and
comp...
Large Language Models (LLMs) are capable of performing zero-shot closed-...
This evidence-based position paper critiques current research practices
...
Figurative language permeates human communication, but at the same time ...
Instruction tuning has shown great promise in the field of natural langu...
Despite the major advances in NLP, significant disparities in NLP system...
Extracting structured and grounded fact triples from raw text is a
funda...
This paper aims to explore the potential of leveraging Large Language Mo...
Multilingual Large Language Models (LLMs) have recently shown great
capa...
There has been a surge of interest in utilizing Knowledge Graphs (KGs) f...
Large language models (LLMs) with instruction finetuning demonstrate sup...
While code-mixing is a common linguistic practice in many parts of the w...
Code-Switching, a common phenomenon in written text and conversation, ha...
We present NusaCrowd, a collaborative initiative to collect and unite
ex...
The BLOOM model is a large open-source multilingual language model capab...
Multitask prompted finetuning (MTF) has been shown to help large languag...
We introduce Mintaka, a complex, natural, and multilingual dataset desig...
At the center of the underlying issues that halt Indonesian natural lang...
Natural language processing (NLP) has a significant impact on society vi...
We release our synthetic parallel paraphrase corpus across 17 languages:...
We propose Nix-TTS, a lightweight neural TTS (Text-to-Speech) model achi...
NLP research is impeded by a lack of resources and awareness of the
chal...
In recent years, large-scale data collection efforts have prioritized th...
We perform knowledge distillation (KD) benchmark from task-specific BERT...
We present IndoNLI, the first human-elicited NLI dataset for Indonesian....
We explore two types of monolingual data that can be included in knowled...
Neural machine translation (NMT) is typically domain-dependent and
style...
Recent advances in Natural Language Processing (NLP) have largely pushed...
In its daily use, the Indonesian language is riddled with informality, t...
Neural Machine Translation (NMT) is resource intensive. We design a
quan...
Asynchronous stochastic gradient descent (SGD) is attractive from a spee...
Previous work in Indonesian part-of-speech (POS) tagging are hard to com...
In order to extract the best possible performance from asynchronous
stoc...
We present Marian, an efficient and self-contained Neural Machine Transl...
We make distributed stochastic gradient descent faster by exchanging spa...