J4 ›› 2014, Vol. 41 ›› Issue (3): 192-195+220. doi: 10.3969/j.issn.1001-2400.2014.03.029

• Research Paper •

Voice endpoint detection fusing Burg spectrum estimation and a signal variability measure

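ZHANG Junchang; HU Haitao; CUI Li

  1. (School of Electronic Information, Northwestern Polytechnical University, Xi'an, Shaanxi 710129, China)
  • Received: 2013-07-15 Online: 2014-06-20 Published: 2014-07-10
  • Corresponding author: ZHANG Junchang
  • About the author: ZHANG Junchang (b. 1969), male, associate professor, Ph.D. E-mail: zhangjc@nwpu.edu.cn
  • Funding:

    Natural Science Foundation of Shaanxi Province (2011JQ8038)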

Robust voice endpoint detection fusing Burg spectrum estimate and signal variability

ZHANG Junchang;HU Haitao;CUI Li   

  1. (School of Electronic Information, Northwestern Polytechnical Univ., Xi'an 710129, China)
  • Received: 2013-07-15 Online: 2014-06-20 Published: 2014-07-10
  • Contact: ZHANG Junchang

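Abstract (translated from Chinese):

Existing feature-based voice endpoint detection methods perform poorly at low signal-to-noise ratios (SNRs) and in nonstationary noise. To address this, a voice endpoint detection method is proposed that fuses Burg spectrum estimation with the long-term signal variability (LTSV) measure. The method uses the LTSV parameter, which characterizes the variability of speech over a relatively long time span and therefore reflects the degree of nonstationarity of speech fairly accurately. Compared with traditional feature-based voice endpoint detection methods, its detection performance at low SNR and in nonstationary noise is considerably improved. Burg spectrum estimation is also incorporated. Compared with the traditional Welch spectrum estimation method, it increases the discriminability of the LTSV parameter and thus further improves detection accuracy. Simulation results show that, at low SNR (-10 dB) and in nonstationary noise, the method fusing Burg spectrum estimation with LTSV generally improves detection accuracy by about 6% or more relative to traditional feature-based voice endpoint detection methods, indicating better robustness in low-SNR, nonstationary-noise environments.

Keywords: voice endpoint detection, signal variability measure, Burg spectrum estimation, low signal-to-noise ratio, nonstationarity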

Abstract:

Voice endpoint detection is challenging, especially in nonstationary noise and at low signal-to-noise ratios (SNRs). This paper therefore proposes a robust voice endpoint detection method that fuses Burg spectrum estimation with long-term signal variability (LTSV). The method uses the LTSV measure, which indicates the degree of nonstationarity of a signal over a long time span. Compared with traditional feature-based voice endpoint detection methods, the proposed method performs considerably better at low SNR and in nonstationary noise. Burg spectrum estimation is also incorporated, which improves the discriminability of the LTSV parameter and thus further improves detection accuracy. Simulation results show that, compared with the standard voice endpoint detection method, the accuracy of the new method generally improves by about 6% or more, indicating better robustness in nonstationary noise and at low SNR.

Key words: voice endpoint detection, long-term signal variability measure, Burg spectrum estimate, low signal-to-noise ratio, nonstationarity
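
The abstract does not give the formulas behind the two ingredients, so the following is only a rough sketch of how they might fit together, not the authors' implementation. It assumes the commonly cited definition of LTSV (for each frequency bin, the entropy of the power normalized over the last R frames, followed by the variance of those entropies across bins) and a textbook Burg recursion for a per-frame autoregressive spectrum. The function names and the parameters `order`, `nfft`, `frame_len`, `hop`, and `R` are hypothetical choices for illustration. Band limiting, spectral smoothing across frames, and the decision threshold are all omitted.

```python
import numpy as np


def burg_psd(frame, order, nfft):
    """AR power spectrum of one frame via Burg's recursion (textbook form)."""
    x = np.asarray(frame, dtype=float)
    f = x[1:].copy()     # forward prediction errors
    b = x[:-1].copy()    # backward prediction errors, delayed by one sample
    a = np.array([1.0])  # AR polynomial coefficients, a[0] = 1
    err = np.dot(x, x) / len(x)  # zeroth-order prediction error power
    for _ in range(order):
        den = np.dot(f, f) + np.dot(b, b)
        # Reflection coefficient minimizing forward + backward error power.
        k = -2.0 * np.dot(f, b) / den if den > 0 else 0.0
        a = np.concatenate([a, [0.0]])
        a = a + k * a[::-1]  # Levinson-style coefficient update
        f, b = (f + k * b)[1:], (b + k * f)[:-1]
        err *= 1.0 - k * k
    # PSD of the AR model: error power over |A(e^{jw})|^2.
    return err / np.abs(np.fft.rfft(a, nfft)) ** 2


def ltsv(S, R):
    """LTSV for each frame m >= R - 1, given power spectra S (frames x bins)."""
    eps = 1e-12  # assumption: small floor to avoid log(0) on silent frames
    out = []
    for m in range(R - 1, S.shape[0]):
        W = S[m - R + 1 : m + 1] + eps         # the last R frames
        P = W / W.sum(axis=0, keepdims=True)   # normalize each bin over time
        H = -(P * np.log(P)).sum(axis=0)       # entropy per frequency bin
        out.append(H.var())                    # variance across bins
    return np.array(out)


def ltsv_from_signal(x, frame_len, hop, order, nfft, R):
    """Frame the signal, estimate a Burg spectrum per frame, then compute LTSV."""
    starts = range(0, len(x) - frame_len + 1, hop)
    S = np.array([burg_psd(x[i : i + frame_len], order, nfft) for i in starts])
    return ltsv(S, R)
```

In a sketch like this, a spectrum that does not change over the window gives the same entropy in every bin and hence an LTSV of zero, while a bin whose power fluctuates over time lowers that bin's entropy and raises the variance. A detector would then compare the LTSV value against a threshold, which is not specified here.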