Whisper is a powerful automatic speech recognition (ASR) model. Neverthe...
BERTScore is an effective and robust automatic metric for reference-based...
Vision Transformers have attracted a lot of attention recently since the...
The capability of generating speech with a specific type of emotion is des...
However, current autoregressive approaches suffer from high latency. In ...
In the development of neural text-to-speech systems, model pre-training ...
This paper describes our participation in the IWSLT-2021 offline...
This paper describes a novel design of a neural network-based speech gen...