Journal of Xidian University, 2021, Vol. 48, Issue (4): 168-175. doi: 10.19665/j.issn1001-2400.2021.04.022
Computer Science and Technology & Cyberspace Security
WANG Yong1, SU Zhuoyi1, ZHU Zhengyu1,2
Received: 2020-06-01
Online: 2021-08-30
Published: 2021-08-31
Contact: Zhengyu ZHU
E-mail: isswy@mail.sysu.edu.cn; 364085901@qq.com; zhuzhengyu0701@163.com
WANG Yong, SU Zhuoyi, ZHU Zhengyu. Detection of voice transformation spoofing using the dense convolutional neural network[J]. Journal of Xidian University, 2021, 48(4): 168-175.
Training dataset | Test dataset | 10 dB noise added | 15 dB noise added | 20 dB noise added | 30 dB noise added | Clean speech |
---|---|---|---|---|---|---|
Timit-1 | Timit-2 | 97.65 | 98.56 | 99.09 | 99.15 | 99.18 |
NIST-1 | NIST-2 | 91.97 | 91.97 | 93.82 | 96.07 | 96.86 |
UME-1 | UME-2 | 91.82 | 91.82 | 94.12 | 94.94 | 96.31 |
N-1&T-1 | UME-2 | 87.62 | 87.62 | 93.58 | 95.84 | 96.16 |
N-1&U-1 | Timit-2 | 90.21 | 90.21 | 95.06 | 95.91 | 96.22 |
T-1&U-1 | NIST-2 | 75.28 | 75.28 | 80.07 | 80.91 | 81.37 |
Mean | | 89.09 | 89.09 | 92.62 | 93.80 | 94.35 |
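
The mean row of the table is a column-wise average over the six training/test pairings. The following is a minimal Python sketch of that arithmetic, with the values copied from the table as printed; the names `rows`, `columns`, and `column_means` are illustrative and do not come from the article.

```python
# Values copied from the table as printed. Column order:
# 10 dB, 15 dB, 20 dB, 30 dB noise added, and clean speech.
columns = ["10 dB", "15 dB", "20 dB", "30 dB", "clean"]

rows = {
    ("Timit-1", "Timit-2"): [97.65, 98.56, 99.09, 99.15, 99.18],
    ("NIST-1", "NIST-2"): [91.97, 91.97, 93.82, 96.07, 96.86],
    ("UME-1", "UME-2"): [91.82, 91.82, 94.12, 94.94, 96.31],
    ("N-1&T-1", "UME-2"): [87.62, 87.62, 93.58, 95.84, 96.16],
    ("N-1&U-1", "Timit-2"): [90.21, 90.21, 95.06, 95.91, 96.22],
    ("T-1&U-1", "NIST-2"): [75.28, 75.28, 80.07, 80.91, 81.37],
}


def column_means(table):
    """Average each column over all rows, rounded to two decimals."""
    n = len(table)
    return [round(sum(col) / n, 2) for col in zip(*table.values())]


means = dict(zip(columns, column_means(rows)))
for name, value in means.items():
    print(f"{name}: {value:.2f}")
```

With the values as printed, the 10 dB, 20 dB, 30 dB, and clean-speech columns reproduce the listed means (89.09, 92.62, 93.80, 94.35). The 15 dB column averages to 89.24 rather than the listed 89.09.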