Journal of Xidian University, 2021, Vol. 48, Issue 4: 168-175. doi: 10.19665/j.issn1001-2400.2021.04.022
• Computer Science and Technology & Cyberspace Security •
WANG Yong1, SU Zhuoyi1, ZHU Zhengyu1,2
Received: 2020-06-01
Online: 2021-08-30
Published: 2021-08-31
Contact: Zhengyu ZHU
E-mail: isswy@mail.sysu.edu.cn; 364085901@qq.com; zhuzhengyu0701@163.com
WANG Yong, SU Zhuoyi, ZHU Zhengyu. Detection of voice transformation spoofing using the dense convolutional neural network[J]. Journal of Xidian University, 2021, 48(4): 168-175.
Training dataset | Test dataset | 10 dB noise added | 15 dB noise added | 20 dB noise added | 30 dB noise added | Clean speech |
---|---|---|---|---|---|---|
Timit-1 | Timit-2 | 97.65 | 98.56 | 99.09 | 99.15 | 99.18 |
NIST-1 | NIST-2 | 91.97 | 91.97 | 93.82 | 96.07 | 96.86 |
UME-1 | UME-2 | 91.82 | 91.82 | 94.12 | 94.94 | 96.31 |
N-1&T-1 | UME-2 | 87.62 | 87.62 | 93.58 | 95.84 | 96.16 |
N-1&U-1 | Timit-2 | 90.21 | 90.21 | 95.06 | 95.91 | 96.22 |
T-1&U-1 | NIST-2 | 75.28 | 75.28 | 80.07 | 80.91 | 81.37 |
Mean | | 89.09 | 89.09 | 92.62 | 93.80 | 94.35 |