Recent advances in neural text-to-speech (TTS) models bring thousands of...
In this paper, we present ZeroPrompt (Figure 1-(a)) and the correspondin...
This paper summarizes the outcomes from the ISCSLP 2022 Intelligent Cock...
Recently, the unified streaming and non-streaming two-pass (U2/U2++) end...
In this paper, we present TrimTail, a simple but effective emission regu...
The recently proposed Conformer architecture which combines convolution ...
Speaker modeling is essential for many related tasks, such as speaker re...
Keyword spotting (KWS) enables speech-based user interaction and gradual...
Recently, we made available WeNet, a production-oriented end-to-end spee...
In this paper, we present WenetSpeech, a multi-domain Mandarin corpus co...
The reasonable definition of semantic interpretability presents the core...
The unified streaming and non-streaming two-pass (U2) end-to-end model f...
In this paper, we present a new open-source, production-first and produc...
In this paper, we present a novel two-pass approach to unify streaming a...
In this paper, we diagnose deep neural networks for 3D point cloud proce...
This paper proposes a set of rules to revise various neural networks for...
Deep learning models (DLMs) are state-of-the-art techniques in speech re...