Yanzhang He

research

∙ 05/24/2023

RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models

With the rapid increase in the size of neural networks, model compressio...

David Qiu, et al.

research

∙ 03/15/2023

Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models

Continued improvements in machine learning techniques offer exciting new...

Steven M. Hernandez, et al.

research

∙ 11/28/2022

E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model

We explore unifying a neural segmenter with two-pass cascaded encoder AS...

W. Ronny Huang, et al.

research

∙ 11/01/2022

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems

Automatic speech recognition (ASR) systems typically rely on an external...

Shaan Bijwadia, et al.

research

∙ 08/29/2022

A Language Agnostic Multilingual Streaming On-Device ASR System

On-device end-to-end (E2E) models have shown improvements over a convent...

Bo Li, et al.

research

∙ 08/29/2022

Turn-Taking Prediction for Natural Conversational Speech

While a streaming voice assistant system has been used in many applicati...

Shuo-yiin Chang, et al.

research

∙ 06/29/2022

Improving Deliberation by Text-Only and Semi-Supervised Training

Text-only and semi-supervised training based on audio-only data has gain...

Ke Hu, et al.

research

∙ 04/15/2022

Improving Rare Word Recognition with LM-aware MWER Training

Language models (LMs) significantly improve the recognition accuracy of ...

Weiran Wang, et al.

research

∙ 04/13/2022

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

In this paper, we propose a dynamic cascaded encoder Automatic Speech Re...

Shaojin Ding, et al.

research

∙ 04/08/2022

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition

Personalization of on-device speech recognition (ASR) has seen explosive...

Shaojin Ding, et al.

research

∙ 03/29/2022

4-bit Conformer with Native Quantization Aware Training for Speech Recognition

Reducing the latency and model size has always been a significant resear...

Shaojin Ding, et al.

research

∙ 02/24/2022

Closing the Gap between Single-User and Multi-User VoiceFilter-Lite

VoiceFilter-Lite is a speaker-conditioned voice separation model that pl...

Rajeev Rikhye, et al.

research

∙ 10/30/2021

Cross-attention conformer for context modeling in speech enhancement for ASR

This work introduces cross-attention conformer, an attention-based archi...

Arun Narayanan, et al.

research

∙ 10/07/2021

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition

As end-to-end automatic speech recognition (ASR) models reach promising ...

Qiujia Li, et al.

research

∙ 10/01/2021

Large-scale ASR Domain Adaptation using Self- and Semi-supervised Learning

Self- and semi-supervised learning methods have been actively investigat...

Dongseong Hwang, et al.

research

∙ 09/15/2021

Tied Reduced RNN-T Decoder

Previous works on the Recurrent Neural Network-Transducer (RNN-T) models...

Rami Botros, et al.

research

∙ 07/02/2021

Multi-user VoiceFilter-Lite via Attentive Speaker Embedding

In this paper, we propose a solution to allow speaker conditioned speech...

Rajeev Rikhye, et al.

research

∙ 04/28/2021

Personalized Keyphrase Detection using Speaker and Environment Information

In this paper, we introduce a streaming keyphrase detection system that ...

Rajeev Rikhye, et al.

research

∙ 04/26/2021

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction

Confidence scores are very useful for downstream applications of automat...

David Qiu, et al.

research

∙ 03/11/2021

Learning Word-Level Confidence For Subword End-to-End ASR

We study the problem of word-level confidence estimation in subword-base...

David Qiu, et al.

research

∙ 12/12/2020

Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging

End-to-end models that condition the output label sequence on all previo...

Rohit Prabhavalkar, et al.

research

∙ 11/21/2020

A Better and Faster End-to-End Model for Streaming ASR

End-to-end (E2E) models have shown to outperform state-of-the-art conven...

Bo Li, et al.

research

∙ 10/22/2020

Confidence Estimation for Attention-based Sequence-to-sequence Models for Speech Recognition

For various speech-related tasks, confidence scores from a speech recogn...

Qiujia Li, et al.

research

∙ 10/21/2020

FastEmit: Low-latency Streaming ASR with Sequence-level Emission Regularization

Streaming automatic speech recognition (ASR) aims to emit each hypothesi...

Jiahui Yu, et al.

research

∙ 09/09/2020

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition

We introduce VoiceFilter-Lite, a single-channel source separation model ...

Quan Wang, et al.

research

∙ 08/30/2020

Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition

Recent advances of end-to-end models have outperformed conventional mode...

Wei Li, et al.

research

∙ 06/02/2020

Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer

The demand for fast and accurate incremental speech recognition increase...

Yuan Shangguan, et al.

research

∙ 03/28/2020

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Thus far, end-to-end (E2E) models have not been shown to outperform stat...

Tara N. Sainath, et al.

research

∙ 08/29/2019

Two-Pass End-to-End Speech Recognition

The requirements for many applications of state-of-the-art speech recogn...

Tara N. Sainath, et al.

research

∙ 02/21/2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Lingvo is a TensorFlow framework offering a complete solution for collab...

Jonathan Shen, et al.

research

∙ 11/15/2018

Streaming End-to-end Speech Recognition For Mobile Devices

End-to-end (E2E) models, which directly predict output character sequenc...

Yanzhang He, et al.

research

∙ 10/26/2017

Streaming Small-Footprint Keyword Spotting using Sequence-to-Sequence Models

We develop streaming keyword spotting systems using a recurrent neural n...

Yanzhang He, et al.
