Denoising diffusion probabilistic models (diffusion models for short) re...
Cross-speaker style transfer is crucial to the applications of multi-sty...
Neural end-to-end TTS can generate very high-quality synthesized speech,...
In this paper, we introduce the Variational Autoencoder (VAE) to an
end-...