Gary Wang

research

∙ 08/14/2023

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Accurate recognition of specific categories, such as persons' names, dat...

0 Yochai Blau, et al. ∙

research

∙ 04/27/2023

Understanding Shared Speech-Text Representations

Recently, a number of approaches to train speech models by incorporatin...

0 Gary Wang, et al. ∙

research

∙ 03/02/2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

We introduce the Universal Speech Model (USM), a single large model that...

0 Yu Zhang, et al. ∙

research

∙ 10/31/2022

Modular Hybrid Autoregressive Transducer

Text-only adaptation of a transducer model remains challenging for end-t...

0 Zhong Meng, et al. ∙

research

∙ 10/27/2022

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

This paper proposes Virtuoso, a massively multilingual speech-text joint...

0 Takaaki Saeki, et al. ∙

research

∙ 10/19/2022

G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR

Data augmentation is a ubiquitous technique used to provide robustness t...

0 Gary Wang, et al. ∙

research

∙ 09/15/2022

Non-Parallel Voice Conversion for ASR Augmentation

Automatic speech recognition (ASR) needs to be robust to speaker differe...

0 Gary Wang, et al. ∙

research

∙ 08/22/2022

Autonomous Ground Navigation in Highly Constrained Spaces: Lessons learned from The BARN Challenge at ICRA 2022

The BARN (Benchmark Autonomous Robot Navigation) Challenge took place at...

0 Xuesu Xiao, et al. ∙

research

∙ 05/16/2022

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data

Building inclusive speech recognition systems is a crucial step towards ...

0 Alëna Aksënova, et al. ∙

research

∙ 08/27/2021

Injecting Text in Self-Supervised Speech Pretraining

Self-supervised pretraining for Automated Speech Recognition (ASR) has s...

0 Zhehuai Chen, et al. ∙

research

∙ 03/11/2019

Deep Text-to-Speech System with Seq2Seq Model

Recent trends in neural network based text-to-speech/speech synthesis pi...

0 Gary Wang, et al. ∙

research

∙ 07/16/2018

Tiered Object Storage using Persistent Memory

Most data intensive applications often access only a few fields of the o...

0 Johnu George, et al. ∙
