J4 ›› 2014, Vol. 41 ›› Issue (6): 155-159+194. doi: 10.3969/j.issn.1001-2400.2014.06.026

• Research Paper •

Speech enhancement method using self-adaptive time-shift and threshold discrete cosine transform

ZHANG Junchang; LIU Haipeng; FAN Yangyu

  1. (School of Electronic Information, Northwestern Polytechnical University, Xi'an 710129, Shaanxi, China)
  • Received: 2013-10-24 Online: 2014-12-20 Published: 2015-01-19
  • Contact: ZHANG Junchang
  • About the author: ZHANG Junchang (born 1969), male, associate professor, Ph.D., E-mail: zhangjc@nwpu.edu.cn.
  • Supported by:

    National Natural Science Foundation of China (60872159)

Abstract:

Existing speech enhancement methods degrade at low signal-to-noise ratios (SNRs). To address this, this paper proposes a speech enhancement algorithm based on the discrete cosine transform (DCT) with a self-adaptive time-shift and threshold. First, the soft-threshold function applied to the DCT coefficients is improved so that it removes noise from both noise-dominant and speech-dominant frames, and the threshold is selected adaptively according to the SNR, which largely preserves the original characteristics of the speech. Second, the shift of the analysis window is adapted to the pitch period, which reduces the white noise produced by a fixed window shift. Pitch detection uses a weighted autocorrelation function that combines the short-time autocorrelation function with the short-time average magnitude difference function, improving both the accuracy of pitch detection and its robustness to noise. Theoretical analysis and simulation results show that, at input SNRs as low as -5 dB, the algorithm yields a considerably higher output SNR and better robustness than the existing empirical mode decomposition and subspace algorithms.

Key words: speech enhancement, discrete cosine transform, adaptive threshold, adaptive time-shift, weighted autocorrelation function
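
The abstract names a weighted autocorrelation function for pitch detection but does not give its formula. The sketch below illustrates one common way to combine the two measures, assumed here rather than taken from the paper: the short-time autocorrelation at each lag is divided by the short-time average magnitude difference at that lag plus a small constant, so that lags where the frame repeats score highly on both counts. The function names, the search range `f_min`/`f_max`, and the constant `eps` are hypothetical choices for illustration.

```python
import math


def short_time_acf(frame, lag):
    """Short-time autocorrelation of one frame at the given lag."""
    n = len(frame)
    return sum(frame[i] * frame[i + lag] for i in range(n - lag))


def short_time_amdf(frame, lag):
    """Short-time average magnitude difference of one frame at the given lag."""
    n = len(frame)
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag)) / (n - lag)


def weighted_acf_pitch(frame, fs, f_min=60.0, f_max=400.0, eps=1e-3):
    """Estimate the pitch period of one frame, in samples.

    Assumed weighting (not confirmed by the abstract):
        score(lag) = ACF(lag) / (AMDF(lag) + eps)
    The lag with the largest score inside the search range is returned.
    """
    lag_min = int(fs / f_max)                        # shortest period searched
    lag_max = min(int(fs / f_min), len(frame) - 1)   # longest period searched
    best_lag, best_score = lag_min, -math.inf
    for lag in range(lag_min, lag_max + 1):
        score = short_time_acf(frame, lag) / (short_time_amdf(frame, lag) + eps)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Usage: a clean 200 Hz tone sampled at 8 kHz has a period of 40 samples.
fs = 8000
f0 = 200.0
frame = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]
period = weighted_acf_pitch(frame, fs)
print(period)  # 40
```

In a pitch-adaptive scheme such as the one the abstract describes, the estimated period could then inform the shift of the analysis window. The exact rule relating the two is not stated in the abstract, so it is left out of this sketch.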