Conformer, combining convolution and self-attention sequentially to capt...
Spoken language understanding (SLU) tasks involve mapping from speech au...
Progress in speech processing has been facilitated by shared datasets an...
Automatic speech recognition (ASR) models make fewer errors when more su...
In this paper, we explore the use of pre-trained language models to lear...
Speaker diarization is a task to label audio or video recordings with cl...
This paper presents multistream CNN, a novel neural network architecture...
In this paper we present state-of-the-art (SOTA) performance on the Libr...
In this study, we propose a new spectral clustering framework that can a...
Self-attention has been a huge success for many downstream tasks in NLP,...
In this paper we show how we have achieved the state-of-the-art performa...