BinBin Zhang

research

∙ 08/31/2023

LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech

Recent advances in neural text-to-speech (TTS) models bring thousands of...

0 Jie Chen, et al. ∙

research

∙ 05/18/2023

ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs

In this paper, we present ZeroPrompt (Figure 1-(a)) and the correspondin...

0 Xingchen Song, et al. ∙

research

∙ 11/03/2022

The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results

This paper summarizes the outcomes from the ISCSLP 2022 Intelligent Cock...

0 Ao Zhang, et al. ∙

research

∙ 11/02/2022

Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames

Recently, the unified streaming and non-streaming two-pass (U2/U2++) end...

0 Chengdong Liang, et al. ∙

research

∙ 11/01/2022

TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty

In this paper, we present TrimTail, a simple but effective emission regu...

0 Xingchen Song, et al. ∙

research

∙ 10/31/2022

FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition

The recently proposed Conformer architecture which combines convolution ...

0 Xingchen Song, et al. ∙

research

∙ 10/31/2022

Wespeaker: A Research and Production oriented Speaker Embedding Learning Toolkit

Speaker modeling is essential for many related tasks, such as speaker re...

0 Hongji Wang, et al. ∙

research

∙ 10/30/2022

WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit

Keyword spotting (KWS) enables speech-based user interaction and gradual...

0 Jie Wang, et al. ∙

research

∙ 03/29/2022

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

Recently, we made available WeNet, a production-oriented end-to-end spee...

0 BinBin Zhang, et al. ∙

research

∙ 10/07/2021

WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition

In this paper, we present WenetSpeech, a multi-domain Mandarin corpus co...

0 BinBin Zhang, et al. ∙

research

∙ 07/09/2021

Interpretable Compositional Convolutional Neural Networks

The reasonable definition of semantic interpretability presents the core...

0 Wen Shen, et al. ∙

research

∙ 06/10/2021

U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition

The unified streaming and non-streaming two-pass (U2) end-to-end model f...

0 Di Wu, et al. ∙

research

∙ 02/02/2021

WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit

In this paper, we present a new open source, production first and produc...

0 BinBin Zhang, et al. ∙

research

∙ 12/10/2020

Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition

In this paper, we present a novel two-pass approach to unify streaming a...

0 BinBin Zhang, et al. ∙

research

∙ 11/20/2019

Utility Analysis of Network Architectures for 3D Point Cloud Processing

In this paper, we diagnose deep neural networks for 3D point cloud proce...

8 Shikun Huang, et al. ∙

research

∙ 11/20/2019

3D-Rotation-Equivariant Quaternion Neural Networks

This paper proposes a set of rules to revise various neural networks for...

26 BinBin Zhang, et al. ∙

research

∙ 03/17/2017

Empirical Evaluation of Parallel Training Algorithms on Acoustic Modeling

Deep learning models (DLMs) are state-of-the-art techniques in speech re...

0 Wenpeng Li, et al. ∙
