J4 ›› 2015, Vol. 42 ›› Issue (5): 120-124+160. doi: 10.3969/j.issn.1001-2400.2015.05.021

• Research Paper •

New fuzzy SVM model used in imbalanced datasets

CAI Yanyan; SONG Xiaodong

  1. (School of Economics and Management, Beihang Univ., Beijing 100191, China)
  • Received: 2014-09-29 Online: 2015-10-20 Published: 2015-12-03
  • Contact: SONG Xiaodong
  • About the author: CAI Yanyan (born 1976), female, Ph.D. candidate at Beihang University. E-mail: caiyanyan@buaa.edu.cn
  • Supported by:

    Key Program of the National Natural Science Foundation of China (70821061)

Abstract:

This paper proposes a new fuzzy support vector machine for classifying imbalanced datasets, called CI-FSVM (Class Imbalance Fuzzy Support Vector Machine). By improving the penalty function, the model becomes less sensitive to noise samples ("overlap") in imbalanced datasets. The parameters of each SVM model compared are tuned with a grid search algorithm. The results show that CI-FSVM classifies imbalanced datasets better than the other methods: it achieves a higher overall accuracy and also improves the classification accuracy on the minority class.

Key words: support vector machine, classification, imbalanced datasets, noise samples, penalty function
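
The abstract does not give the form of the improved penalty function, so the following is only a minimal sketch of the general idea behind fuzzy SVMs for imbalanced data, not the CI-FSVM formulation itself. It assumes a common textbook choice: each sample's penalty is C multiplied by a fuzzy membership that decays linearly with distance from its own class centre (so likely noise points count less), and by a class factor that down-weights the majority class. The function name `fuzzy_penalties` and the parameters `C` and `delta` are hypothetical.

```python
import math
from collections import Counter


def fuzzy_penalties(X, y, C=1.0, delta=1e-6):
    """Per-sample penalties C * s_i for a fuzzy SVM (illustrative assumption).

    s_i = (1 - d_i / (r_class + delta)) * (n_minority / n_class), where d_i is
    the distance of sample i to its class centroid and r_class is the largest
    such distance within the class.
    """
    counts = Counter(y)
    n_minority = min(counts.values())

    # Centroid of each class.
    centroids = {}
    for label in counts:
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    # Distance of every sample to the centroid of its own class.
    dist = [math.dist(x, centroids[t]) for x, t in zip(X, y)]

    # Class radius: the largest within-class distance.
    radius = {
        label: max(d for d, t in zip(dist, y) if t == label) for label in counts
    }

    penalties = []
    for d, t in zip(dist, y):
        membership = 1.0 - d / (radius[t] + delta)  # close to 0 for the farthest point
        class_factor = n_minority / counts[t]       # 1 for the minority class, < 1 otherwise
        penalties.append(C * membership * class_factor)
    return penalties


# Toy data: five majority samples (label -1, one of them an outlier)
# and three minority samples (label +1).
X = [[0, 0], [1, 0], [0, 1], [1, 1], [6, 6], [4, 4], [6, 4], [5, 4.3]]
y = [-1, -1, -1, -1, -1, 1, 1, 1]
for point, label, p in zip(X, y, fuzzy_penalties(X, y)):
    print(point, label, round(p, 3))
```

Penalties of this kind could be passed as per-sample weights to an SVM solver (for example, the `sample_weight` argument of scikit-learn's `SVC.fit`), with `C` and the kernel parameters then chosen by grid search as the abstract describes.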