The current speech anti-spoofing countermeasures (CMs) show excellent
pe...
The detection of spoofing speech generated by unseen algorithms remains ...
The wav2vec 2.0 and integrated spectro-temporal graph attention network
...
Neural networks have been able to generate high-quality single-sentence
...
When labeled data is insufficient, semi-supervised learning with the
pse...
Utilizing the large-scale unlabeled data from the target domain via
pseu...
This report describes our submission to Track 1 and Track 3 for VoxCeleb
S...
Previous research in speech enhancement has mostly focused on modeling t...
ECAPA-TDNN is currently the most popular TDNN-series model for speaker
v...
Selecting application scenarios matching data is important for the autom...
Recently, convolutional neural networks (CNNs) have been widely used in ...
In this paper, we propose a two-stage heterogeneous lightweight network ...
This paper describes the deepfake audio detection system submitted to th...
Code-switching automatic speech recognition has become one of the most
chal...
The cross-domain performance of automatic speech recognition (ASR) could...
Automatic speech recognition (ASR) with federated learning (FL) makes it...
Voice conversion models have been developed for decades, and current mainstre...
Recently, audio-visual scene classification (AVSC) has attracted increas...
Probabilistic Linear Discriminant Analysis (PLDA) was the dominant and
n...
Previous research has looked into ways to improve speech emotion recogni...
Recently, end-to-end automatic speech recognition models based on
connec...
The voice conversion task is to modify the speaker identity of continuou...
While Transformers have achieved promising results in end-to-end (E2E)
a...
Self-supervised acoustic pre-training has achieved amazing results on th...
Self-supervised pre-training has dramatically improved the performance o...
Recently, dual-path networks have achieved promising performance due to ...
Recently, neural architecture search (NAS) has been successfully used in i...
When only limited target domain data is available, domain adaptation cou...
When only a limited amount of accented speech data is available, to prom...
Access to large corpora with strongly labelled sound events is expensive...
With the success of deep learning in speech signal processing,
speaker-i...
In acoustic scene classification (ASC), acoustic features play a crucial...
In recent years, the involvement of synthetic strongly labeled data, weak...
Recently, Transformer has gained success in automatic speech recognition...
Utterance-level permutation invariant training (uPIT) has achieved promi...
Recently, researchers set an ambitious goal of conducting speaker recogn...
This technical report describes the IOA team's submission for TASK1A of
...
This paper presents the contribution to the third 'CHiME' speech separat...