Hand-crafted Attention is All You Need? A Study of Attention on Self-supervised Audio Transformer

06/09/2020
by Tsung-Han Wu, et al.

In this paper, we seek to reduce the computational complexity of transformer-based models for speech representation learning. We evaluate 10 attention mechanisms, pre-train a transformer-based model with each of them in a self-supervised fashion, and use the resulting models as feature extractors on downstream tasks, including phoneme classification and speaker classification. We find that the proposed approach, which uses only hand-crafted and learnable attentions, is comparable to full self-attention.
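
To make the idea of a hand-crafted attention concrete, the sketch below shows one possible fixed-pattern replacement for learned query-key attention: a banded local window with uniform weights, where only the value and output projections remain learnable. The module name, window size, and uniform weighting are illustrative assumptions, not the paper's specific attention variants, which should be taken from the full text.

```python
import torch
import torch.nn as nn


class HandCraftedLocalAttention(nn.Module):
    """Attention whose weights come from a fixed local window instead of
    learned query-key dot products (illustrative sketch only)."""

    def __init__(self, dim: int, window: int = 4):
        super().__init__()
        self.value = nn.Linear(dim, dim)  # values are still learned
        self.out = nn.Linear(dim, dim)
        self.window = window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) acoustic frame features
        seq_len = x.size(1)
        pos = torch.arange(seq_len, device=x.device)
        # Fixed banded mask: each frame attends to neighbours within `window`.
        mask = (pos[None, :] - pos[:, None]).abs() <= self.window
        weights = mask.float() / mask.float().sum(-1, keepdim=True)
        # No query/key computation at all; weights are data-independent.
        context = weights @ self.value(x)
        return self.out(context)


if __name__ == "__main__":
    feats = torch.randn(2, 100, 256)  # (batch, frames, feature_dim)
    attn = HandCraftedLocalAttention(dim=256, window=4)
    print(attn(feats).shape)  # torch.Size([2, 100, 256])
```

Because the attention pattern is fixed, this layer avoids the quadratic query-key score computation of full self-attention, which is the kind of complexity reduction the abstract refers to.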
