Journal of Xidian University, 2019, Vol. 46, Issue 4: 130-136. doi: 10.19665/j.issn1001-2400.2019.04.018
YUAN Wenhao, LIANG Chunyan, LOU Yingxi, FANG Chao, WANG Zhiqiang
Received: 2019-03-22
Online: 2019-08-20
Published: 2019-08-15
YUAN Wenhao, LIANG Chunyan, LOU Yingxi, FANG Chao, WANG Zhiqiang. Speech enhancement method based on the time-frequency smoothing deep neural network[J]. Journal of Xidian University, 2019, 46(4): 130-136.

Noise | SNR/dB | Noisy speech | Fully connected neural network | Gated recurrent unit | Time-frequency smoothing network |
---|---|---|---|---|---|
N1 | -7 | 1.62 | 1.65 | 2.06 | 2.24 |
N1 | 0 | 2.08 | 2.25 | 2.65 | 2.78 |
N1 | 7 | 2.54 | 2.68 | 3.09 | 3.21 |
N2 | -7 | 1.29 | 1.24 | 1.67 | 1.98 |
N2 | 0 | 1.63 | 1.79 | 2.24 | 2.47 |
N2 | 7 | 2.08 | 2.32 | 2.73 | 2.90 |
N3 | -7 | 1.49 | 1.45 | 1.78 | 2.10 |
N3 | 0 | 1.81 | 1.92 | 2.32 | 2.62 |
N3 | 7 | 2.21 | 2.42 | 2.80 | 3.08 |
N4 | -7 | 1.30 | 1.23 | 1.59 | 2.02 |
N4 | 0 | 1.57 | 1.69 | 2.07 | 2.51 |
N4 | 7 | 1.97 | 2.17 | 2.56 | 2.92 |

Noise | SNR/dB | Noisy speech | Fully connected neural network | Gated recurrent unit | Time-frequency smoothing network |
---|---|---|---|---|---|
N1 | -7 | 0.61 | 0.58 | 0.71 | 0.71 |
N1 | 0 | 0.76 | 0.73 | 0.84 | 0.84 |
N1 | 7 | 0.87 | 0.82 | 0.91 | 0.90 |
N2 | -7 | 0.48 | 0.44 | 0.58 | 0.61 |
N2 | 0 | 0.63 | 0.61 | 0.75 | 0.76 |
N2 | 7 | 0.79 | 0.75 | 0.86 | 0.86 |
N3 | -7 | 0.53 | 0.45 | 0.60 | 0.67 |
N3 | 0 | 0.69 | 0.64 | 0.78 | 0.82 |
N3 | 7 | 0.84 | 0.80 | 0.88 | 0.90 |
N4 | -7 | 0.52 | 0.46 | 0.63 | 0.67 |
N4 | 0 | 0.69 | 0.65 | 0.78 | 0.80 |
N4 | 7 | 0.84 | 0.79 | 0.88 | 0.88 |