Sifting through vast textual data and summarizing key information impose...
Vision-language models (VLMs), such as CLIP and ALIGN, are generally tra...
We systematically investigate lightweight strategies to adapt large lang...
Multimodal models trained on large natural image-text pair datasets have...
Radiology report summarization is a growing area of research. Given the
...
Neural image-to-text radiology report generation systems offer the poten...
Machine learning models that achieve high overall accuracy often make
sy...
This paper aims to bring a new lightweight yet powerful solution for the...
Understanding expressed sentiment and emotions are two crucial factors i...
Recently, generative adversarial networks (GAN) have gathered a lot of
i...
As new data-sets for real-world visual reasoning and compositional quest...
Even with the growing interest in problems at the intersection of Comput...
When searching for an object humans navigate through a scene using seman...
Neural Image Captioning (NIC) or neural caption generation has attracted...
This paper describes the UMONS solution for the Multimodal Machine
Trans...
This paper describes the UMONS solution for the OMG-Emotion Challenge. W...
We propose a new and fully end-to-end approach for multimodal translatio...
In Multimodal Neural Machine Translation (MNMT), a neural model generate...
In state-of-the-art Neural Machine Translation (NMT), an attention mecha...
In state-of-the-art Neural Machine Translation, an attention mechanism i...