Recognition of personalized content remains a challenge in end-to-end sp...
In interactive automatic speech recognition (ASR) systems, low-latency r...
Machine learning model weights and activations are represented in full-p...
We present dual-attention neural biasing, an architecture designed to bo...
End-to-End (E2E) automatic speech recognition (ASR) systems used in voic...
Despite improvements to the generalization performance of automated spee...
We present a streaming, Transformer-based end-to-end automatic speech re...
We present a novel sub-8-bit quantization-aware training (S8BQAT) scheme...
Second-pass rescoring is an important component in automatic speech reco...
Recent years have seen significant advances in end-to-end (E2E) spoken l...
End-to-end (E2E) automatic speech recognition (ASR) systems often have d...
Spoken language understanding (SLU) systems translate voice input comman...
We present Bifocal RNN-T, a new variant of the Recurrent Neural Network ...
As more speech processing applications execute locally on edge devices, ...
We introduce Amortized Neural Networks (AmNets), a compute cost- and lat...
Language modeling (LM) for automatic speech recognition (ASR) does not u...
Comprehending the overall intent of an utterance helps a listener recogn...
Wav2vec-C introduces a novel representation learning technique combining...
The recognition of personalized content, such as contact names, remains ...
Spoken language understanding (SLU) systems extract transcriptions, as w...
In order to achieve high accuracy for machine learning (ML) applications...
As voice assistants become more ubiquitous, they are increasingly expect...
Accent mismatch is a critical problem for end-to-end ASR. This paper...
End-to-end automatic speech recognition (ASR) systems, such as recurrent...
We consider the problem of spoken language understanding (SLU) of extrac...
In this paper, we propose a streaming model to distinguish voice queries...
Decomposing models into multiple components is critically important in m...
Multilingual ASR technology simplifies model training and deployment, bu...
Grapheme-to-phoneme (G2P) models are a key component in Automatic Speech...
This paper presents our modeling and architecture approaches for buildin...
End-to-end approaches for automatic speech recognition (ASR) benefit fro...
Neural language models (NLM) have been shown to outperform conventional ...
Language models (LM) for interactive speech recognition systems are trai...
This article presents a whisper speech detector in the far-field domain....
In this work, we propose a classifier for distinguishing device-directed...
Statistical language models (LM) play a key role in Automatic Speech Rec...
Statistical language models (LM) play a key role in Automatic Speech Rec...