Discrete audio representation, also known as audio tokenization, has seen renewed
...
This paper presents an overview and evaluation of some of the end-to-end...
Large language models (LLMs) have shown great promise for capturing
cont...
We study speech intent classification and slot filling (SICSF) by propos...
We present AmberNet, a compact end-to-end neural network for Spoken Lang...
Speaker diarization systems are challenged by a trade-off between the
te...
End-to-end automatic speech recognition systems have achieved great accu...
In the English speech-to-text (STT) machine learning task, acoustic mode...