Modern smart glasses leverage advanced audio sensing and machine learnin...
Recently reported state-of-the-art results in visual speech recognition ...
Recognizing a word shortly after it is spoken is an important requiremen...
Neural transducers have gained popularity in production ASR systems, ach...
The two most popular loss functions for streaming end-to-end automatic s...
Graph-based temporal classification (GTC), a generalized form of the con...
The recurrent neural network transducer (RNN-T) objective plays a major ...
Pseudo-labeling (PL), a semi-supervised learning (SSL) method where a se...
Attention-based end-to-end automatic speech recognition (ASR) systems ha...
Pseudo-labeling (PL) has been shown to be effective in semi-supervised a...
This paper addresses end-to-end automatic speech recognition (ASR) for l...
Self-attention has become an important and widely used neural network co...
The performance of automatic speech recognition (ASR) systems typically ...
Semi-supervised learning has demonstrated promising results in automatic...
We propose an unsupervised speaker adaptation method inspired by the neu...
Encoder-decoder based sequence-to-sequence models have demonstrated stat...