Shan Yang

research

∙ 08/17/2023

ICAR: Image-based Complementary Auto Reasoning

Scene-aware Complementary Item Retrieval (CIR) is a challenging task whi...

0 Xijun Wang, et al. ∙

research

∙ 03/18/2023

Local-to-Global Panorama Inpainting for Locale-Aware Indoor Lighting Prediction

Predicting panoramic indoor lighting from a single perspective image is ...

0 Jiayang Bai, et al. ∙

research

∙ 02/07/2023

Aligning Multi-Sequence CMR Towards Fully Automated Myocardial Pathology Segmentation

Myocardial pathology segmentation (MyoPS) is critical for the risk strat...

0 Wangbin Ding, et al. ∙

research

∙ 12/03/2022

UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis

Text-to-speech (TTS) and singing voice synthesis (SVS) aim at generating...

0 Yi Lei, et al. ∙

research

∙ 11/06/2022

MyoPS-Net: Myocardial Pathology Segmentation with Flexible Combination of Multi-Sequence CMR Images

Myocardial pathology segmentation (MyoPS) can be a prerequisite for the ...

0 Junyi Qiu, et al. ∙

research

∙ 08/11/2022

TotalSegmentator: robust segmentation of 104 anatomical structures in CT images

In this work we focus on automatic segmentation of multiple anatomical s...

17 Jakob Wasserthal, et al. ∙

research

∙ 07/05/2022

Glow-WaveGAN 2: High-quality Zero-shot Text-to-speech Synthesis and Any-to-any Voice Conversion

The zero-shot scenario for speech generation aims at synthesizing a nove...

0 Yi Lei, et al. ∙

research

∙ 07/02/2022

Learning Noise-independent Speech Representation for High-quality Voice Conversion for Noisy Target Speakers

Building a voice conversion system for noisy target speakers, such as us...

0 Liumeng Xue, et al. ∙

research

∙ 06/15/2022

End-to-End Voice Conversion with Information Perturbation

The ideal goal of voice conversion is to convert the source speaker's sp...

0 Qicong Xie, et al. ∙

research

∙ 02/18/2022

VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion

Though significant progress has been made for speaker-dependent Video-to...

0 Disong Wang, et al. ∙

research

∙ 02/13/2022

Deep Graph Learning for Spatially-Varying Indoor Lighting Prediction

Lighting prediction from a single image is becoming increasingly importa...

5 Jiayang Bai, et al. ∙

research

∙ 01/17/2022

MsEmoTTS: Multi-scale emotion transfer, prediction, and control for emotional speech synthesis

Expressive synthetic speech is essential for many human-computer interac...

0 Yi Lei, et al. ∙

research

∙ 12/29/2021

A Color Image Steganography Based on Frequency Sub-band Selection

Color image steganography based on deep learning is the art of hiding in...

10 Hai Su, et al. ∙

research

∙ 09/08/2021

Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis

Cross-speaker style transfer (CSST) in text-to-speech (TTS) synthesis ai...

0 Songxiang Liu, et al. ∙

research

∙ 06/30/2021

Attention Bottlenecks for Multimodal Fusion

Humans perceive the world by concurrently processing and fusing high-dim...

24 Arsha Nagrani, et al. ∙

research

∙ 06/21/2021

Glow-WaveGAN: Learning Speech Representations from GAN-based Variational Auto-Encoder For High Fidelity Flow-based Speech Synthesis

Current two-stage TTS framework typically integrates an acoustic model w...

0 Jian Cong, et al. ∙

research

∙ 06/21/2021

Controllable Context-aware Conversational Speech Synthesis

In spoken conversations, spontaneous behaviors like filled pause and pro...

0 Jian Cong, et al. ∙

research

∙ 06/17/2021

Optical Mouse: 3D Mouse Pose From Single-View Video

We present a method to infer the 3D pose of mice, including the limbs an...

3 Bo Hu, et al. ∙

research

∙ 06/04/2021

Entity Concept-enhanced Few-shot Relation Extraction

Few-shot relation extraction (FSRE) is of great importance in long-tail ...

0 Shan Yang, et al. ∙

research

∙ 01/21/2021

Learn to Dance with AIST++: Music Conditioned 3D Dance Generation

In this paper, we present a transformer-based learning framework for 3D ...

0 Ruilong Li, et al. ∙

research

∙ 11/17/2020

Controllable Emotion Transfer For End-to-End Speech Synthesis

Emotion embedding space learned from references is a straightforward app...

0 Tao Li, et al. ∙

research

∙ 11/17/2020

Fine-grained Emotion Strength Transfer, Control and Prediction for Emotional Speech Synthesis

This paper proposes a unified model to conduct emotion transfer, control...

0 Yi Lei, et al. ∙

research

∙ 11/17/2020

Learn2Sing: Target Speaker Singing Voice Synthesis by learning from a Singing Teacher

Singing voice synthesis has been paid rising attention with the rapid de...

0 Heyang Xue, et al. ∙

research

∙ 08/10/2020

Data Efficient Voice Cloning from Noisy Samples with Domain Adversarial Training

Data efficient voice cloning aims at synthesizing target speaker's voice...

0 Jian Cong, et al. ∙

research

∙ 08/03/2020

Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis

Attention-based seq2seq text-to-speech systems, especially those use sel...

0 Fengyu Yang, et al. ∙

research

∙ 05/11/2020

Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech

In this paper, we propose multi-band MelGAN, a much faster waveform gene...

0 Geng Yang, et al. ∙

research

∙ 04/28/2020

Adversarial Feature Learning and Unsupervised Clustering based Speech Synthesis for Found Data with Acoustic and Textual Noise

Attention-based sequence-to-sequence (seq2seq) speech synthesis has achi...

0 Shan Yang, et al. ∙

research

∙ 07/06/2017

Statistical Parametric Speech Synthesis Using Generative Adversarial Networks Under A Multi-task Learning Framework

In this paper, we aim at improving the performance of synthesized speech...

0 Shan Yang, et al. ∙

research

∙ 08/03/2016

Detailed Garment Recovery from a Single-View Image

Most recent garment capturing techniques rely on acquiring multiple view...

0 Shan Yang, et al. ∙

research

∙ 07/31/2016

Modeling Context in Referring Expressions

Humans refer to objects in their environments all the time, especially i...

0 Licheng Yu, et al. ∙

