We propose a new framework for extracting visual information about a sce...
What audio embedding approach generalizes best to a wide range of downst...
We release synth1B1, a multi-modal audio corpus consisting of 1 billion
...
We introduce a data-driven approach to automatic pitch correction of sol...
We describe a machine-learning approach to pitch correcting a solo singi...
There are many application scenarios for which the computational perfor...