ROTUNDE - A Smart Meeting Cinematography Initiative: Tools, Datasets, and Benchmarks for Cognitive Interpretation and Control
We construe smart meeting cinematography with a focus on professional situations such as meetings and seminars, possibly conducted in a distributed manner across socio-spatially separated groups. The basic objective in smart meeting cinematography is to interpret professional interactions involving people, and automatically produce dynamic recordings of discussions, debates, presentations etc in the presence of multiple communication modalities. Typical modalities include gestures (e.g., raising one's hand for a question, applause), voice and interruption, electronic apparatus (e.g., pressing a button), movement (e.g., standing-up, moving around) etc. ROTUNDE, an instance of smart meeting cinematography concept, aims to: (a) develop functionality-driven benchmarks with respect to the interpretation and control capabilities of human-cinematographers, real-time video editors, surveillance personnel, and typical human performance in everyday situations; (b) Develop general tools for the commonsense cognitive interpretation of dynamic scenes from the viewpoint of visuo-spatial cognition centred perceptual narrativisation. Particular emphasis is placed on declarative representations and interfacing mechanisms that seamlessly integrate within large-scale cognitive (interaction) systems and companion technologies consisting of diverse AI sub-components. For instance, the envisaged tools would provide general capabilities for high-level commonsense reasoning about space, events, actions, change, and interaction.
READ FULL TEXT