Efficient Monaural Speech Enhancement using Spectrum Attention Fusion

08/04/2023
by   Jinyu Long, et al.

Speech enhancement is a demanding task in automated speech processing pipelines, focusing on separating clean speech from noisy channels. Transformer-based models have recently outperformed RNN and CNN models in speech enhancement, but they are far more computationally expensive and require much more high-quality training data, which is often scarce. In this paper, we present an improvement for speech enhancement models that maintains the expressiveness of self-attention while significantly reducing model complexity, which we term Spectrum Attention Fusion. We carefully construct a convolutional module to replace several self-attention layers in a speech Transformer, allowing the model to fuse spectral features more efficiently. On the Voice Bank + DEMAND dataset, our proposed model achieves results comparable to or better than SOTA models while using significantly fewer parameters (0.58M).
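The abstract does not specify the Spectrum Attention Fusion module itself, but the core idea it describes, replacing quadratic-cost self-attention over spectral frames with a local convolutional mix, can be illustrated with a minimal numpy sketch. The function name `conv_fusion`, the kernel, and all shapes below are hypothetical, chosen only to show why a convolution along time costs O(T·k) per frequency bin rather than the O(T²) of full self-attention:

```python
import numpy as np

def conv_fusion(spec, kernel):
    """Fuse spectral frames with a 1-D convolution along time.

    A stand-in for self-attention: each output frame is a weighted
    combination of its k neighbouring frames, so cost grows as O(T*k)
    instead of the O(T^2) of full pairwise attention.
    spec: (T, F) magnitude spectrogram; kernel: (k,) weights, k odd.
    """
    k = len(kernel)
    pad = k // 2
    padded = np.pad(spec, ((pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(spec)
    for t in range(spec.shape[0]):
        # weighted sum over the local temporal window around frame t
        out[t] = kernel @ padded[t:t + k]
    return out

# toy example: 100 time frames x 257 frequency bins
rng = np.random.default_rng(0)
spec = rng.random((100, 257))
smoothed = conv_fusion(spec, np.array([0.25, 0.5, 0.25]))
print(smoothed.shape)  # (100, 257)
```

In a real model the fixed kernel would be a learned (and typically multi-channel, depthwise) convolution, but the complexity argument is the same.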


Related research

07/28/2023 · PCNN: A Lightweight Parallel Conformer Neural Network for Efficient Monaural Speech Enhancement
Convolutional neural networks (CNN) and Transformer have wildly succeede...

02/06/2022 · On Using Transformers for Speech-Separation
Transformers have enabled major improvements in deep learning. They ofte...

05/06/2021 · Speech Enhancement using Separable Polling Attention and Global Layer Normalization followed with PReLU
Single channel speech enhancement is a challenging task in speech commun...

05/15/2023 · Ripple sparse self-attention for monaural speech enhancement
The use of Transformer represents a recent success in speech enhancement...

09/04/2023 · Single-Channel Speech Enhancement with Deep Complex U-Networks and Probabilistic Latent Space Models
In this paper, we propose to extend the deep, complex U-Network architec...

10/13/2019 · T-GSA: Transformer with Gaussian-weighted self-attention for speech enhancement
Transformer neural networks (TNN) demonstrated state-of-art performance ...

06/30/2021 · DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement
Single-channel speech enhancement (SE) is an important task in speech pr...
