J4 ›› 2013, Vol. 40 ›› Issue (3): 14-19+94. doi: 10.3969/j.issn.1001-2400.2013.03.003

• Research Paper •

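Packet-layer voice quality assessment model using two-level temporal pooling

JIANG Liangliang (江亮亮); YANG Fuzheng (杨付正); REN Guangliang (任光亮)

  1. (State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, Shaanxi 710071, China)
  • Received: 2012-10-08 Online: 2013-06-20 Published: 2013-07-29
  • About the author: JIANG Liangliang (born 1988), male, Ph.D. candidate at Xidian University. E-mail: lljiang@stu.xidian.edu.cn.
  • Funding:

    Supported by the National Natural Science Foundation of China (60902081, 60902052) and the Programme of Introducing Talents of Discipline to Universities (B08038)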

Packet-layer model for voice quality assessment using a two-level temporal pooling scheme

JIANG Liangliang;YANG Fuzheng;REN Guangliang   

  1. (State Key Lab. of Integrated Service Networks, Xidian Univ., Xi'an 710071, China)
  • Received: 2012-10-08 Online: 2013-06-20 Published: 2013-07-29

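Abstract (Chinese edition):

Different packet loss patterns yield different voice quality at the same packet loss rate. To address this, a packet-layer voice quality assessment model that reflects the effect of the loss pattern on voice quality is proposed. First, information such as the coding bit rate and the positions of lost packets is obtained by analyzing the packet headers; on this basis, the quality of each frame is predicted by combining silence detection with the results of error propagation. Next, the voice sequence is freely partitioned into variable-length groups of frames according to the characteristics of human perception, and the quality of each group is obtained by pooling the qualities of its frames. Finally, the qualities of all groups are combined to give the overall quality of the voice sequence. In its two-level temporal pooling process, the proposed model assigns larger weights to severely distorted regions and thereby effectively reflects the effect of the loss pattern on voice quality. Experimental results show that, compared with the E-model in the international standard G.107, the proposed model's scores, evaluated against those of the Perceptual Evaluation of Speech Quality (PESQ) method, achieve an average increase of 0.0129 in the Pearson correlation coefficient and an average reduction of 0.0234 in the root mean squared error.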
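
The two-level temporal pooling described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the weighting function, the quality scale, or the rule for choosing group boundaries, so everything concrete here is an assumption made for illustration: the 4.5 ceiling `Q_MAX`, the weight `1 + d**exponent`, the caller-supplied `gof_lengths`, and the function names `pool`, `split_into_gofs`, and `two_level_quality`. It is not the authors' formula; it only shows how weighting distorted regions more heavily makes the score depend on the loss pattern and not only on the loss rate.

```python
from typing import List, Sequence

Q_MAX = 4.5  # assumed best score on a MOS-like scale (hypothetical)


def pool(qualities: Sequence[float], exponent: float = 2.0) -> float:
    """Distortion-weighted average: items with lower quality get larger weights."""
    if not qualities:
        raise ValueError("cannot pool an empty sequence")
    distortions = [max(0.0, Q_MAX - q) for q in qualities]
    # Hypothetical weighting: 1 + d**exponent grows with the distortion d.
    weights = [1.0 + d ** exponent for d in distortions]
    pooled = sum(w * d for w, d in zip(weights, distortions)) / sum(weights)
    return Q_MAX - pooled


def split_into_gofs(frame_quality: Sequence[float],
                    gof_lengths: Sequence[int]) -> List[List[float]]:
    """Cut the frame sequence into variable-length groups of frames (GOFs)."""
    if any(n <= 0 for n in gof_lengths):
        raise ValueError("every GOF must contain at least one frame")
    if sum(gof_lengths) != len(frame_quality):
        raise ValueError("GOF lengths must cover the sequence exactly")
    gofs, start = [], 0
    for n in gof_lengths:
        gofs.append(list(frame_quality[start:start + n]))
        start += n
    return gofs


def two_level_quality(frame_quality: Sequence[float],
                      gof_lengths: Sequence[int]) -> float:
    """Short-term pooling inside each GOF, then long-term pooling across GOFs."""
    gof_quality = [pool(g) for g in split_into_gofs(frame_quality, gof_lengths)]
    return pool(gof_quality)


if __name__ == "__main__":
    # Same loss rate (2 degraded frames out of 8), different loss patterns.
    bursty = [4.5, 1.5, 1.5, 4.5, 4.5, 4.5, 4.5, 4.5]
    scattered = [4.5, 1.5, 4.5, 4.5, 4.5, 4.5, 1.5, 4.5]
    print(two_level_quality(bursty, [4, 4]))
    print(two_level_quality(scattered, [4, 4]))
```

With this assumed weighting the two patterns receive different scores even though their plain frame averages are identical, and both scores fall below that average because the degraded frames dominate the pooling.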

Key words (Chinese edition): voice quality assessment, temporal pooling, packet loss, quality of service

Abstract:

Different packet loss patterns can produce significantly different voice quality at the same packet loss rate. To address this, a packet-layer model for voice quality assessment that reflects the effect of packet loss patterns on voice quality is presented. First, information about the coding bit rate and packet loss is obtained by analyzing the packet header; on this basis, the quality of each frame is estimated using additional information from silence detection and error propagation. Then the voice sequence is divided into variable-length groups of frames (GOFs), and a short-term temporal pooling method is applied to obtain the quality of each GOF. Finally, the overall voice quality is determined by long-term temporal pooling of the GOF qualities. Because the strongest impairments are predominantly emphasized, the proposed two-level temporal pooling scheme describes the effect of different packet loss patterns on voice quality well. Experimental results show that, compared with the E-model in ITU-T Recommendation G.107, the presented model increases the Pearson correlation coefficient (PCC) by about 0.0129 and decreases the root mean squared error (RMSE) by about 0.0234.
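
The two figures of merit quoted above, PCC and RMSE, follow standard definitions and can be computed as in the sketch below. The function names `pearson` and `rmse` and the argument names are illustrative choices, and the sketch assumes the model scores and the reference scores are compared directly; the abstract does not say whether any score mapping is applied beforehand.

```python
import math
from typing import Sequence


def pearson(model: Sequence[float], reference: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long score lists."""
    if len(model) != len(reference) or len(model) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_m = sum(model) / len(model)
    mean_r = sum(reference) / len(reference)
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(model, reference))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_m == 0.0 or var_r == 0.0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_m * var_r)


def rmse(model: Sequence[float], reference: Sequence[float]) -> float:
    """Root mean squared error between two equally long score lists."""
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((m - r) ** 2 for m, r in zip(model, reference))
    return math.sqrt(squared / len(model))
```

A higher PCC means the model ranks and scales conditions more like the reference does, while a lower RMSE means its scores are closer in absolute terms; the abstract reports improvements in both.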

Key words: voice quality assessment, temporal pooling, packet loss, quality of service

CLC Number:

  • TN912