In a noisy conversation environment such as a dinner party, people often...
We introduce the novel-view acoustic synthesis (NVAS) task: given the si...
Can conversational videos captured from multiple egocentric viewpoints r...
Audio-visual speech enhancement aims to extract clean speech from a nois...
Most speech enhancement (SE) models learn a point estimate, and do not m...
We propose to characterize and improve the performance of blind room imp...
We present RemixIT, a simple yet effective self-supervised method for tr...
Impulse response estimation in high noise and in-the-wild settings, with...
Augmented reality devices have the potential to enhance human perception...
Deep learning approaches have emerged that aim to transform an audio sig...
Augmented Reality (AR) as a platform has the potential to facilitate the...
Transfer learning is critical for efficient information transfer across ...
Estimating the camera wearer's body pose from an egocentric view (egopose) i...
An important problem in machine auditory perception is to recognize and ...
Moving around in the world is naturally a multisensory experience, but t...
Weakly supervised learning algorithms are critical for scaling audio eve...