Journal of Xidian University, 2021, Vol. 48, Issue (3): 91-98. doi: 10.19665/j.issn1001-2400.2021.03.012
• Computer Science and Technology & Artificial Intelligence •
MEI Shulin¹, JIA Hairong¹, WANG Xiaogang², WU Yifeng²
Received: 2019-12-12
Online: 2021-06-20
Published: 2021-07-05
Contact: Hairong JIA
E-mail: 1243748225@qq.com; helenjia722@163.com; wangxg117@chinaunicom.cn; wyf911@126.com
MEI Shulin, JIA Hairong, WANG Xiaogang, WU Yifeng. Combination of dynamic features with a new mask to optimize neural network speech enhancement[J]. Journal of Xidian University, 2021, 48(3): 91-98.

Noise | SNR/dB | SegSNR, noisy speech | SegSNR, Algorithm 1 | SegSNR, Algorithm 2 | SegSNR, Algorithm 3
---|---|---|---|---|---
F16 noise | 10 | -16.2015 | 2.4623 | 2.5309 | 3.9346
F16 noise | 5 | -20.8089 | 0.4848 | 0.5467 | 1.6489
F16 noise | 0 | -25.1112 | -1.9372 | -1.8394 | -0.9659
F16 noise | -5 | -29.3359 | -6.6609 | -6.4237 | -5.4234
F16 noise | -10 | -33.7867 | -13.2356 | -13.5672 | -12.0134
Babble noise | 10 | -15.9128 | 0.8600 | 1.5378 | 2.3674
Babble noise | 5 | -20.5937 | -1.8765 | -1.4569 | 0.2442
Babble noise | 0 | -25.0504 | -6.4012 | -5.589 | -4.5780
Babble noise | -5 | -28.8228 | -7.8811 | -6.8898 | -6.0356
Babble noise | -10 | -32.1345 | -14.1256 | -13.1267 | -12.3324
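
For readers unfamiliar with the SegSNR metric reported above, the following is a minimal sketch of one common definition: the SNR in dB is computed on each non-overlapping frame and then averaged over frames. This is an illustration only, not the authors' evaluation code. The function name `segmental_snr`, the frame length of 256 samples, and the absence of per-frame clamping are all assumptions; the page does not state which variant was used.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, eps=1e-10):
    """Average per-frame SNR in dB between a clean and a processed signal.

    Assumptions (not taken from the paper): non-overlapping frames of
    `frame_len` samples, no clamping of per-frame values, and a small
    `eps` to avoid division by zero or log of zero.
    """
    n = min(len(clean), len(enhanced))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        y = enhanced[start:start + frame_len]
        # Energy of the clean frame and of the residual error in that frame.
        signal_energy = sum(v * v for v in s)
        error_energy = sum((a - b) ** 2 for a, b in zip(s, y))
        ratio = (signal_energy + eps) / (error_energy + eps)
        frame_snrs.append(10.0 * math.log10(ratio))
    if not frame_snrs:
        raise ValueError("signals are shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


# Usage: a signal whose error is 10% of the clean amplitude in every frame.
clean = [math.sin(0.1 * i) for i in range(1024)]
enhanced = [1.1 * v for v in clean]
score = segmental_snr(clean, enhanced)
```

Many published implementations additionally clamp each frame's SNR to a fixed range before averaging; that step is left out here because the page does not say whether it was applied.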

Noise | SNR/dB | PESQ, noisy speech | PESQ, Algorithm 1 | PESQ, Algorithm 2 | PESQ, Algorithm 3
---|---|---|---|---|---
F16 noise | 10 | 2.1976 | 2.6238 | 2.6667 | 2.7197
F16 noise | 5 | 2.0312 | 2.1214 | 2.3539 | 2.6658
F16 noise | 0 | 1.6747 | 2.0168 | 2.1226 | 2.4284
F16 noise | -5 | 1.4627 | 1.9506 | 1.5239 | 2.2615
F16 noise | -10 | 1.1214 | 1.2107 | 1.4437 | 2.1837
Babble noise | 10 | 2.2351 | 2.5585 | 2.6020 | 2.6103
Babble noise | 5 | 1.7847 | 2.3294 | 2.5645 | 2.5909
Babble noise | 0 | 1.4227 | 1.7968 | 2.1103 | 2.2538
Babble noise | -5 | 1.1437 | 1.5678 | 1.9947 | 2.1674
Babble noise | -10 | 0.9869 | 1.0994 | 1.2348 | 2.0102

Noise | SNR/dB | STOI, noisy speech | STOI, Algorithm 1 | STOI, Algorithm 2 | STOI, Algorithm 3
---|---|---|---|---|---
F16 noise | 10 | 0.8000 | 0.8310 | 0.8704 | 0.9061
F16 noise | 5 | 0.7693 | 0.8001 | 0.8189 | 0.8599
F16 noise | 0 | 0.7120 | 0.7668 | 0.7940 | 0.8140
F16 noise | -5 | 0.6663 | 0.7020 | 0.7590 | 0.7750
F16 noise | -10 | 0.6430 | 0.6600 | 0.6789 | 0.6963
Babble noise | 10 | 0.7994 | 0.8296 | 0.8751 | 0.8976
Babble noise | 5 | 0.7520 | 0.8121 | 0.8230 | 0.8643
Babble noise | 0 | 0.6956 | 0.8352 | 0.8361 | 0.8366
Babble noise | -5 | 0.6766 | 0.7192 | 0.7441 | 0.7873
Babble noise | -10 | 0.6107 | 0.6555 | 0.6600 | 0.6942
