Journal of Xidian University ›› 2021, Vol. 48 ›› Issue (4): 168-175. doi: 10.19665/j.issn1001-2400.2021.04.022

• Computer Science and Technology & Cyberspace Security •

Detection of voice transformation spoofing using the dense convolutional neural network

WANG Yong1, SU Zhuoyi1, ZHU Zhengyu1,2

  1. School of Cyberspace Security, Guangdong Polytechnic Normal University, Guangzhou 510665, China
  2. Audio, Speech and Vision Processing Laboratory, South China University of Technology, Guangzhou 510641, China
  • Received:2020-06-01 Online:2021-08-30 Published:2021-08-31
  • Contact: Zhengyu ZHU E-mail:isswy@mail.sysu.edu.cn;364085901@qq.com;zhuzhengyu0701@163.com

Abstract:

Voice transformation (VT) spoofing hides a speaker’s identity by altering the speaker’s acoustic features with speech processing algorithms, which causes extremely high false rejection rates in automatic speaker recognition (ASR) systems. VT spoofing is cheap to implement and has been integrated into many audio editing tools, so it poses a serious threat to social security. However, research on VT spoofing detection remains insufficient. In this paper, we therefore propose a VT detection method based on a dense convolutional neural network (DenseNet) that distinguishes spoofed voices from genuine ones. The proposed network consists of 135 layers in total. Maximizing the skip connections between layers enhances data transmission and lets both deep and shallow edge features be used for classification, which alleviates the degradation phenomenon and further improves detection accuracy. Experimental results show that the detection accuracy exceeds 98% across various spoofing factors.
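The dense connectivity that the abstract credits for carrying both shallow and deep features to the classifier can be illustrated with a minimal sketch. This is not the 135-layer network of the paper, whose configuration the abstract does not give. The names `dense_block` and `make_toy_layer`, the growth rate of 2, and the toy layer arithmetic are all assumptions made for illustration, and feature maps are flattened to plain lists so the example runs without any deep learning library.

```python
from typing import Callable, List

Feature = List[float]  # one feature map, flattened to a list for illustration


def dense_block(x: Feature, layers: List[Callable[[Feature], Feature]]) -> Feature:
    """Densely connected block: every layer receives all earlier outputs."""
    features = [x]  # the block input plus every layer output produced so far
    for layer in layers:
        # Concatenate the input and all previous outputs (the skip connections).
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))
    # The block output keeps shallow and deep features side by side.
    return [v for f in features for v in f]


def make_toy_layer(growth_rate: int) -> Callable[[Feature], Feature]:
    """Hypothetical stand-in for a real convolutional layer (assumption).

    It emits `growth_rate` new values computed from the mean of its input,
    followed by a ReLU-style clamp at zero.
    """
    def layer(inp: Feature) -> Feature:
        mean = sum(inp) / len(inp)
        return [max(0.0, mean + i) for i in range(growth_rate)]
    return layer


out = dense_block([1.0, 2.0, 3.0, 4.0], [make_toy_layer(2) for _ in range(3)])
print(len(out))  # 4 input values + 3 layers * growth rate 2 = 10
```

Because each layer appends its output instead of overwriting the running features, the original input values are still present in the block output. This is the sense in which shallow features remain available alongside deep ones.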

Key words: voice transformation spoofing, detection, security, neural network

CLC Number: 

  • TP39