Journal of Xidian University ›› 2019, Vol. 46 ›› Issue (1): 143-150. doi: 10.19665/j.issn1001-2400.2019.01.023

Multimodal emotion recognition for the fusion of speech and EEG signals

MA Jianghe, SUN Ying, ZHANG Xueying

  1. College of Information and Computer, Taiyuan University of Technology, Taiyuan 030024, China
  • Received: 2018-05-29 Online: 2019-02-20 Published: 2019-03-05
  • Contact: SUN Ying
  • About the author: MA Jianghe (born 1992), male, master's student at Taiyuan University of Technology, E-mail: 1360370562@qq.com
  • Funding:
    National Natural Science Foundation of China (61371193); Shanxi Province Youth Science and Technology Research Fund (2013021016-2)

Abstract:

To construct an effective emotion recognition system, four emotions (joy, sadness, anger and neutrality) are induced by sound stimulation, and the corresponding speech and EEG signals are collected. First, nonlinear geometric features and nonlinear attribute features of the EEG and speech signals are extracted by phase space reconstruction and combined with the basic features of each signal, so that emotion recognition is performed on each modality separately. Then, a feature fusion algorithm based on the Restricted Boltzmann Machine is constructed to realize multimodal emotion recognition at the feature level. Finally, a multimodal emotion recognition system is constructed at the decision level by using the quadratic decision algorithm. The results show that the overall recognition rate of the system built by feature fusion is 1.08% and 2.75% higher than that of speech signals alone and EEG signals alone, respectively, and that the overall recognition rate of the system built by decision fusion is 6.52% and 8.19% higher, respectively. Decision fusion thus gives a better overall recognition result than feature fusion. Fusing emotional data from different sources, such as speech signals and EEG signals, can therefore yield a more effective emotion recognition system.

Key words: speech signals, electroencephalogram (EEG) signals, feature fusion, decision fusion
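
The phase space reconstruction named in the abstract is commonly carried out by time-delay embedding. The following is a minimal sketch of that general technique and not the authors' implementation: the function name `delay_embed`, the embedding dimension `dim` and the delay `tau` are illustrative assumptions, and the abstract does not state which values or which derived features the paper uses.

```python
import numpy as np


def delay_embed(signal, dim, tau):
    """Return the time-delay embedding of a 1-D signal.

    Row i is [x[i], x[i + tau], ..., x[i + (dim - 1) * tau]].
    `dim` and `tau` are hypothetical parameters chosen by the caller.
    """
    x = np.asarray(signal, dtype=float)
    n_points = x.size - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("signal too short for this dim and tau")
    # Column k holds the signal shifted by k * tau samples.
    columns = [x[k * tau : k * tau + n_points] for k in range(dim)]
    return np.stack(columns, axis=1)


# Toy input standing in for one speech or EEG channel.
trajectory = delay_embed(np.arange(10), dim=3, tau=2)
print(trajectory.shape)
```

Each row of `trajectory` is one point of the reconstructed trajectory. Geometric descriptors of that trajectory could then serve as nonlinear features alongside the basic features of each signal.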

CLC Number:

  • TP391.4