DCTM: Dilated Convolutional Transformer Model for Multimodal Engagement Estimation in Conversation

07/31/2023
by Vu Ngoc Tu, et al.

Conversational engagement estimation is posed as a regression problem: the goal is to quantify the favorable attention and involvement of each participant in a conversation. The task is important for gaining insight into human interaction dynamics and behavior patterns in conversational settings. In this research, we introduce a dilated convolutional Transformer for modeling and estimating human engagement in the MULTIMEDIATE 2023 competition. Our proposed system surpasses the baseline models, with a 7% improvement on the test set and a 4% improvement on the validation set. Moreover, we evaluate several modality fusion mechanisms and show that, for this type of data, simple concatenation followed by self-attention fusion achieves the best performance.
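
The abstract names two building blocks, dilated convolution and concatenation followed by self-attention fusion, without giving their details. The following is a minimal, dependency-free sketch of one possible reading of those ideas, not the authors' implementation. The function names (`dilated_conv1d`, `concat_fusion`, `self_attention`), the zero padding, the identity query/key/value projections, and the toy audio/video features are all assumptions made for illustration.

```python
import math


def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1D dilated convolution over a list of floats.

    Assumes an odd-length kernel; out-of-range taps are treated as zero.
    """
    k = len(kernel)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t + (j - k // 2) * dilation  # taps are spaced `dilation` steps apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def concat_fusion(modality_a, modality_b):
    """Concatenate per-time-step feature vectors from two modalities."""
    return [a + b for a, b in zip(modality_a, modality_b)]


def softmax(values):
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention over time steps.

    Illustrative simplification: Q = K = V = seq (no learned projections).
    """
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in seq]
        weights = softmax(scores)
        out.append([sum(weights[i] * seq[i][c] for i in range(len(seq)))
                    for c in range(d)])
    return out


# Hypothetical toy features: 2 time steps, 1 audio dim and 2 video dims.
audio = [[1.0], [2.0]]
video = [[0.5, 0.5], [1.0, 1.0]]
fused = self_attention(concat_fusion(audio, video))
```

With dilation 2 and a 3-tap kernel, each output of `dilated_conv1d` sees inputs spanning 5 time steps, which is how dilation widens temporal context without adding parameters. In a trained model the attention would use learned projections and multiple heads; those are omitted here to keep the sketch short.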
