While direction of arrival (DOA) of sound events is generally estimated ...
We present a multi-speaker Japanese audiobook text-to-speech (TTS) syste...
This report presents the Sony-TAu Realistic Spatial Soundscapes 2022 (ST...
Data-based and learning-based sound source localization (SSL) has shown ...
Bilingual English speakers speak English as one of their languages. Thei...
This report presents the dataset and baseline of Task 3 of the DCASE2021...
Sound event localization and detection is a novel area of research that ...
This report presents the dataset and the evaluation setup of the Sound E...
The perceptual quality of neural text-to-speech (TTS) is highly dependen...
This paper presents the sound event localization and detection (SELD) ta...
This paper investigates the joint localization, detection, and tracking ...
In this paper, we propose a convolutional recurrent neural network for j...
In this paper, we propose a stacked convolutional and recurrent neural n...
This paper proposes a deep neural network for estimating the directions ...
This paper proposes a neural network architecture and training scheme to...
In this paper, we compare the performance of using binaural audio featur...
We present the first approach to automated audio captioning. We employ a...
Bird sounds possess distinctive spectral structure which may exhibit sma...