We present a methodology to train our multi-speaker emotional text-to-sp...
Although Perplexity is a widely used performance metric for language mod...
We report a GPT-based multi-sentence language model for dialogue generat...
We aim to separate the generative factors of data into two latent vector...
Voice conversion (VC) is the task of transforming a person's voice to differe...
Emotion is not limited to discrete categories of happy, sad, angry, fear...
Many speech enhancement methods try to learn the relationship between no...
This paper introduces a deep neural network model for subband-based spee...
Multi-task learning is a method for improving the generalizability of mu...
Multi-task learning (MTL) is one of the methods for improving generalizab...
We propose a neural text-to-speech (TTS) model that can imitate a new sp...
In this paper, we introduce an emotional speech synthesizer based on the...
Convolutional neural networks (CNNs) with convolutional and pooling oper...
This paper describes a Hierarchical Composition Recurrent Network (HCRN)...