Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (4): 168-175. doi: 10.19665/j.issn1001-2400.2021.04.022

• Computer Science and Technology & Cyberspace Security •



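  • About the authors: WANG Yong (b. 1976), male, associate professor, E-mail: isswy@mail.sysu.edu.cn | SU Zhuoyi (b. 1995), male, master's student at Guangdong Polytechnic Normal University, E-mail: 364085901@qq.com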

Detection of voice transformation spoofing using the dense convolutional neural network

WANG Yong1, SU Zhuoyi1, ZHU Zhengyu1,2

  1. School of Cyberspace Security, Guangdong Polytechnic Normal University, Guangzhou 510665, China
  2. Audio, Speech and Vision Processing Laboratory, South China University of Technology, Guangzhou 510641, China
  • Received: 2020-06-01 Online: 2021-08-30 Published: 2021-08-31
  • Contact: Zhengyu ZHU





Voice transformation (VT) spoofing hides a speaker’s identity by altering the speaker’s acoustic features with speech processing algorithms, which causes extremely high false rejection rates in automatic speaker recognition (ASR) systems. VT spoofing is cheap to implement and has been integrated into many audio editing tools, so it poses a serious threat to social security. However, research on VT spoofing detection is still insufficient. In this paper, we propose a VT detection method based on a dense convolutional neural network (DenseNet) for distinguishing spoofed voices from genuine ones. The proposed network consists of 135 layers in total. By maximizing the number of skip connections between layers, the network strengthens data transmission and makes both deep and shallow edge features available for classification, which alleviates the degradation problem and further improves detection accuracy. Experimental results show that the detection accuracy exceeds 98% under various spoofing factors.
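
The dense skip connections mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' 135-layer network: the helper `toy_layer`, the growth value, and the input vector are hypothetical stand-ins chosen for illustration, and plain Python lists replace real convolutional feature maps. The sketch only shows the connectivity pattern, in which each layer receives the concatenation of the block input and all earlier layer outputs, so shallow features remain available at the end of the block.

```python
# Minimal sketch of dense connectivity (illustrative only; not the paper's model).
from typing import List

Vector = List[float]


def toy_layer(inputs: Vector, growth: int) -> Vector:
    """Hypothetical stand-in for a convolutional layer: emits `growth` new
    features, each a ReLU of the input mean plus a per-feature offset."""
    mean = sum(inputs) / len(inputs)
    return [max(0.0, mean + 0.1 * j) for j in range(growth)]


def dense_block(x: Vector, num_layers: int, growth: int) -> Vector:
    """Every layer sees the block input plus the outputs of all earlier layers."""
    features = list(x)  # running concatenation of everything produced so far
    for _ in range(num_layers):
        new = toy_layer(features, growth)
        features = features + new  # skip connection realized by concatenation
    return features


def direct_connections(num_layers: int) -> int:
    """Direct connections when each layer feeds all later ones: L(L+1)/2."""
    return num_layers * (num_layers + 1) // 2


out = dense_block([0.5, -0.2, 1.0], num_layers=4, growth=2)
print(len(out))               # 3 input features + 4 layers * 2 new features each
print(direct_connections(4))  # connections in a 4-layer dense block
```

Because the block output still begins with the untouched input features, a classifier placed after it can draw on both shallow and deep features, which is the property the abstract attributes to the skip connections.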

Key words: voice transformation spoofing, detection, security, neural network


  • CLC number: TP39