Recent research has demonstrated that the multi-task fine-tuning of
mult...
General-purpose language models that can solve various language-domain t...
This paper proposes a framework for quantitatively evaluating interactiv...
We present NusaCrowd, a collaborative initiative to collect and unite
ex...
Large-scale vision-language pre-trained (VLP) models are prone to halluc...
With the rise of deep learning and intelligent vehicles, the smart assis...
Code-switching is a speech phenomenon when a speaker switches language d...
While the recent advances in deep neural networks (DNN) bring remarkable...
Multimodal abstractive summarization (MAS) models that summarize videos
...
Multimodal affect recognition constitutes an important aspect for enhanc...
Existing works on multimodal affective computing tasks, such as emotion
...
Cross-domain named entity recognition (NER) models are able to cope with...
Lay summarization aims to generate lay summaries of scientific papers
au...
Multi-hop Question Generation (QG) aims to generate answer-related quest...
Despite the recent achievements made in the multi-modal emotion recognit...
Nowadays, offensive content in social media has become a serious problem...