In this work, we introduce a framework for cross-lingual speech synthesi...
Neural text-to-speech systems are often optimized on L1/L2 losses, which...
In expressive speech synthesis, it is common practice to use latent prosod...
Non-parallel voice conversion (VC) is typically achieved using lossy rep...
Artificial speech synthesis has made a great leap in terms of naturalnes...
Whilst recent neural text-to-speech (TTS) approaches produce high-qualit...