The objective of this paper is audio-visual synchronisation of general v...
Recent advances in visually-induced audio generation are based on sampli...
Acoustic and visual sensing can support the contactless estimation of th...
Human-robot object handover is a key skill for the future of human-robot...
Dense video captioning aims to localize and describe important events in...
Dense video captioning is the task of localizing interesting events from a...