Journal of Xidian University, 2022, Vol. 49, Issue (3): 183-190. doi: 10.19665/j.issn1001-2400.2022.03.020

• Computer Science and Technology & Artificial Intelligence •

Convolutional quasi-recurrent network for real-time speech enhancement

SHI Yunlong, YUAN Wenhao, HU Shaodong, LOU Yingxi

  1. School of Computer Science and Technology, Shandong University of Technology, Zibo 255000, China
  • Received: 2021-05-25  Revised: 2021-12-08  Online: 2022-06-20  Published: 2022-07-04
  • Contact: Wenhao YUAN  E-mail: syljoy@163.com; why_sdut@126.com; hsd_sdut@163.com; lyx_joy@163.com

Abstract:

To improve the speech enhancement performance of deep neural networks while preserving real-time operation, a convolutional quasi-recurrent network for real-time speech enhancement is proposed. The network takes a causal input: it uses only the time-frequency features of the current and past frames of the noisy speech, which satisfies the input requirement of real-time speech enhancement. A quasi-recurrent neural network models the temporal correlation of the noisy speech, and its ability to process noisy speech sequences in parallel improves the computational efficiency of the model. Convolutional layers improve how the quasi-recurrent neural network extracts frequency-domain features, allowing the network to better exploit the local correlation between adjacent frequency bands of the noisy speech and thereby improve enhancement performance. Experimental results show that, compared with the speech enhancement method based on the quasi-recurrent network, the method based on the convolutional quasi-recurrent network both improves speech enhancement performance and reduces the number of parameters in the network model. Compared with existing methods, the convolutional quasi-recurrent network effectively suppresses the interference of background noise with the target speech, reduces distortion of the target speech, and achieves better speech enhancement performance while keeping the input causal. The real-time performance of the proposed method is verified on different computing platforms.
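
The two ingredients named in the abstract, a causal input and a quasi-recurrent layer, can be illustrated with a minimal NumPy sketch. It follows the generic quasi-recurrent formulation (a causal convolution over time produces candidate and gate sequences in parallel, followed by an element-wise recurrent pooling step) and is not the authors' network: the frequency-axis convolution described in the abstract is not modelled, and the function names, filter width, feature dimension, and hidden size are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def causal_conv(x, w):
    """Causal 1-D convolution over time.

    x: (T, F_in) frames of time-frequency features.
    w: (k, F_in, H) filter bank (hypothetical shapes).
    Output frame t depends only on input frames t-k+1 .. t.
    """
    k, T = w.shape[0], x.shape[0]
    # Left-pad with k-1 zero frames so no future frame is ever read.
    padded = np.concatenate([np.zeros((k - 1, x.shape[1])), x], axis=0)
    out = np.zeros((T, w.shape[2]))
    for j in range(k):
        # padded[j + t] is input frame t - (k - 1 - j).
        out += padded[j:j + T] @ w[j]
    return out

def qrnn_layer(x, wz, wf, wo):
    """Generic quasi-recurrent layer with forget and output gates."""
    # Candidates and gates for all frames are computed in parallel.
    z = np.tanh(causal_conv(x, wz))
    f = sigmoid(causal_conv(x, wf))
    o = sigmoid(causal_conv(x, wo))
    # Only this cheap element-wise pooling is sequential in time.
    c = np.zeros(z.shape[1])
    h = np.zeros_like(z)
    for t in range(z.shape[0]):
        c = f[t] * c + (1.0 - f[t]) * z[t]
        h[t] = o[t] * c
    return h

# Illustrative sizes only: 20 frames, 161 frequency bins, 32 hidden units.
rng = np.random.default_rng(0)
T, F_in, H, k = 20, 161, 32, 2
x = rng.standard_normal((T, F_in))
wz, wf, wo = (0.1 * rng.standard_normal((k, F_in, H)) for _ in range(3))
h = qrnn_layer(x, wz, wf, wo)
```

Because every output frame reads only current and past input frames, replacing the frames from index 10 onward leaves `h[:10]` unchanged, which is the property a real-time enhancement front end relies on.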

Key words: speech enhancement, quasi-recurrent network, convolutional neural network, real-time performance

CLC Number: 

  • TN912