Pengcheng Guo

research

∙ 05/23/2023

BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR

The recently proposed serialized output training (SOT) simplifies multi-...

Yuhao Liang, et al.

research

∙ 11/24/2022

TESSP: Text-Enhanced Self-Supervised Speech Pre-training

Self-supervised speech pre-training empowers the model with the contextu...

Zhuoyuan Yao, et al.

research

∙ 11/06/2022

Distinguishable Speaker Anonymization based on Formant and Fundamental Frequency Scaling

Speech data on the Internet are proliferating exponentially because of t...

Jixun Yao, et al.

research

∙ 11/06/2022

Preserving background sound in noise-robust voice conversion via multi-task learning

Background sound is an informative form of art that is helpful in provid...

Jixun Yao, et al.

research

∙ 10/11/2022

MFCCA:Multi-Frame Cross-Channel attention for multi-speaker ASR in Multi-party meeting scenario

Recently cross-channel attention, which better leverages multi-channel s...

Fan Yu, et al.

research

∙ 09/24/2022

NWPU-ASLP System for the VoicePrivacy 2022 Challenge

This paper presents the NWPU-ASLP speaker anonymization system for Voice...

Jixun Yao, et al.

research

∙ 07/02/2022

Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism

Transformer-based models have demonstrated their effectiveness in automa...

Kun Wei, et al.

research

∙ 04/07/2022

Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

General accent recognition (AR) models tend to directly extract low-leve...

Qijie Shao, et al.

research

∙ 02/08/2022

Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge

The ICASSP 2022 Multi-channel Multi-party Meeting Transcription Grand Ch...

Fan Yu, et al.

research

∙ 10/14/2021

M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

Recent development of speech signal processing, such as speech recogniti...

Fan Yu, et al.

research

∙ 10/09/2021

An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition

Self-supervised pretraining on speech data has achieved a lot of progres...

Xuankai Chang, et al.

research

∙ 10/07/2021

WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition

In this paper, we present WenetSpeech, a multi-domain Mandarin corpus co...

BinBin Zhang, et al.

research

∙ 07/01/2021

ESPnet-ST IWSLT 2021 Offline Speech Translation System

This paper describes the ESPnet-ST group's IWSLT 2021 submission in the ...

Hirofumi Inaguma, et al.

research

∙ 06/16/2021

Multi-Speaker ASR Combining Non-Autoregressive Conformer CTC and Conditional Speaker Chain

Non-autoregressive (NAR) models have achieved a large inference computat...

Pengcheng Guo, et al.

research

∙ 04/10/2021

Boundary and Context Aware Training for CIF-based Non-Autoregressive End-to-end ASR

Continuous integrate-and-fire (CIF) based models, which use a soft and m...

Fan Yu, et al.

research

∙ 12/23/2020

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans

This paper describes the recent development of ESPnet (https://github.co...

Shinji Watanabe, et al.

research

∙ 11/18/2020

Context-aware RNNLM Rescoring for Conversational Speech Recognition

Conversational speech recognition is regarded as a challenging task due ...

Kun Wei, et al.

research

∙ 11/17/2020

Adversarial Training for Multi-domain Speaker Recognition

In real-life applications, the performance of speaker recognition system...

Qing Wang, et al.

research

∙ 10/26/2020

Recent Developments on ESPnet Toolkit Boosted by Conformer

In this study, we present recent developments on ESPnet: End-to-End Spee...

Pengcheng Guo, et al.

research

∙ 06/25/2020

Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals

Neural sequence-to-sequence models are well established for applications...

Jing Shi, et al.

research

∙ 05/21/2020

Inaudible Adversarial Perturbations for Targeted Attack in Speaker Recognition

Speaker recognition is a popular topic in biometric authentication and m...

Qing Wang, et al.

research

∙ 06/16/2018

Study of Semi-supervised Approaches to Improving English-Mandarin Code-Switching Speech Recognition

In this paper, we present our overall efforts to improve the performance...

Pengcheng Guo, et al.
