Although end-to-end (E2E) trainable automatic speech recognition (ASR) h...
Although automatic emotion recognition (AER) has recently drawn signific...
Manually annotating fine-grained slot-value labels for task-oriented dia...
In automatic emotion recognition (AER), labels assigned by different hum...
End-to-end automatic speech recognition (ASR) and large language models,...
This paper proposes handling training data sparsity in speech-based auto...
End-to-end (E2E) automatic speech recognition (ASR) implicitly learns th...
Automatic emotion recognition in conversation (ERC) is crucial for emoti...
Self-supervised learning via masked prediction pre-training (MPPT) has s...
End-to-end spoken language understanding (SLU) suffers from the long-tai...
In speaker diarisation, speaker embedding extraction models often suffer...
Self-supervised-learning-based pre-trained models for speech data, such ...
Incorporating biasing words obtained as contextual knowledge is critical...
Contextual knowledge is essential for reducing speech recognition errors...
Emotion recognition is a key attribute for artificial intelligence syste...
As end-to-end automatic speech recognition (ASR) models reach promising ...
Contextual knowledge is important for real-world automatic speech recogn...
Language models (LMs) pre-trained on massive amounts of text, in particu...
End-to-end models with auto-regressive decoders have shown impressive re...
This paper presents a novel natural gradient and Hessian-free (NGHF) opt...
In this paper, a novel two-branch neural network model structure is prop...
For various speech-related tasks, confidence scores from a speech recogn...
In this paper, we propose a semi-supervised learning (SSL) technique for...
Speaker diarisation systems nowadays use embeddings generated from speec...
This paper proposes a novel method for supervised data clustering. The c...
This paper proposes a novel automatic speech recognition (ASR) framework...
Deep Neural Network (DNN) acoustic models often use discriminative seque...
This paper describes the extension and optimization of our previous work...