Image registration (IR) is a process that deforms images to align them w...
Sketch-Based Image Retrieval (SBIR) is a crucial task in multimedia retr...
Sketch-based image retrieval (SBIR) is the task of retrieving natural im...
The development of virtual agents has enabled human-avatar interactions ...
This work aims at generating captions for soccer videos using deep learn...
Event classification is inherently sequential and multimodal. Therefore,...
In this paper, we propose a study on multi-modal (audio and video) actio...
This paper aims to bring a new lightweight yet powerful solution for the...
We introduce the AVECL-UMons dataset for audio-visual event classificati...
Understanding expressed sentiment and emotions are two crucial factors i...
Recently, generative adversarial networks (GAN) have gathered a lot of i...
As new data-sets for real-world visual reasoning and compositional quest...
Even with the growing interest in problems at the intersection of Comput...
When searching for an object humans navigate through a scene using seman...
Neural Image Captioning (NIC) or neural caption generation has attracted...
This paper describes the UMONS solution for the Multimodal Machine Trans...
The 11th Summer Workshop on Multimodal Interfaces eNTERFACE 2015 was hos...
We propose a new and fully end-to-end approach for multimodal translatio...
In Multimodal Neural Machine Translation (MNMT), a neural model generate...
In state-of-the-art Neural Machine Translation (NMT), an attention mecha...
In state-of-the-art Neural Machine Translation, an attention mechanism i...