Generalized LSTM-based End-to-End Text-Independent Speaker Verification

11/10/2020
by Soroosh Tayebi Arasteh, et al.

The increasing amount of available data and more affordable hardware have opened the door to Deep Learning (DL). Owing to its rapid advancement and growing popularity, DL has spread into almost every field where machine learning is applicable, displacing traditional state-of-the-art methods. While many researchers in speaker recognition have also begun replacing earlier state-of-the-art methods with DL techniques, some traditional i-vector-based methods remain state-of-the-art for text-independent speaker verification (TI-SV). In this paper, we discuss the most recent generalized end-to-end (GE2E) DL technique for TI-SV by Google, which is based on Long Short-Term Memory (LSTM) units, and compare different scenarios and aspects, including utterance duration, training time, and accuracy, to show that our method outperforms the traditional methods.
