We introduce AudioPaLM, a large language model for speech understanding ...
We present SoundStorm, a model for efficient, non-autoregressive audio g...
We introduce SPEAR-TTS, a multi-speaker text-to-speech (TTS) system that...
We introduce AudioLM, a framework for high-quality audio generation with...
We introduce dGSLM, the first "textless" model able to generate audio sa...
Textless spoken language processing research aims to extend the applicab...
Speech emotion conversion is the task of modifying the perceived emotion...
Training data memorization in NLP can both be beneficial (e.g., closed-b...
Speech pre-training has primarily demonstrated efficacy on classificatio...
Despite their practical success, modern seq2seq architectures are unable...
As deep networks begin to be deployed as autonomous agents, the issue of...
We present the Zero Resource Speech Challenge 2021, which asks participa...
We propose using self-supervised discrete representations for the task o...
Contrastive Predictive Coding (CPC), based on predicting future segments...
Sequence-to-sequence (seq2seq) learners are widely used, but we still ha...
Natural language allows us to refer to novel composite concepts by combi...
Studies of discrete languages emerging when neural agents communicate to...
There is renewed interest in simulating language emergence among deep ne...
There is a growing interest in studying the languages emerging when neur...
Despite renewed interest in emergent language simulations with neural ne...
Sequence-processing neural networks led to remarkable progress on many N...