Journal of Xidian University ›› 2016, Vol. 43 ›› Issue (3): 43-48. doi: 10.3969/j.issn.1001-2400.2016.03.008

• Research Paper •

Identifying pathogenic SNP loci by enrichment analysis

YANG Liying; YIN Liyang; YUAN Xiguo; ZHANG Junying

  1. (School of Computer Science and Technology, Xidian Univ., Xi'an 710071, China)
  • Received: 2015-01-21 Online: 2016-06-20 Published: 2016-07-16
  • Corresponding author: YANG Liying
  • About the author: YANG Liying (born 1974), female, associate professor. E-mail: yangliying1208@163.com
  • Supported by:

    Natural Science Foundation of Shaanxi Province (2015JM6275); National Natural Science Foundation of China (61201312); Fundamental Research Funds for the Central Universities (K5051303017; JB140306)

Abstract:

Any single method for identifying pathogenic single nucleotide polymorphism (SNP) loci in complex diseases gives only a partial view. To address this, the paper proposes an identification method based on enrichment analysis. An ensemble learning framework is built on the enrichment analysis mechanism so that different approaches can be combined to improve learning performance. Within this framework, ReliefF and the Cochran-Armitage (CA) trend test are integrated, so that individual pathogenic loci are identified while interactions between loci are also taken into account. Experiments were carried out on both simulated and real data sets. The results show that the proposed method significantly improves the identification of pathogenic SNP loci for complex diseases. The framework also extends well and can serve as a reference for combining other approaches.

Key words: pattern recognition, ensemble learning, interaction, enrichment analysis, identification of pathogenic SNP loci
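For readers unfamiliar with the CA trend test named in the abstract, the following is a minimal sketch of the standard Cochran-Armitage trend statistic for a single SNP. It is not the authors' implementation. The function name `ca_trend_test`, the additive genotype weights (0, 1, 2), and the example counts are assumptions made for illustration. The combination with ReliefF through enrichment analysis is not shown, because the abstract does not specify it.

```python
import math


def ca_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test for one SNP (illustrative sketch).

    cases, controls: genotype counts in the order (AA, Aa, aa).
    weights: genotype scores; (0, 1, 2) assumes an additive model.
    Returns (chi2, p_value), with chi2 referred to a 1-df chi-square.
    """
    totals = [r + s for r, s in zip(cases, controls)]  # per-genotype totals
    n_cases, n_controls = sum(cases), sum(controls)
    n = n_cases + n_controls

    sum_w_cases = sum(w * r for w, r in zip(weights, cases))
    sum_w_totals = sum(w * t for w, t in zip(weights, totals))
    sum_w2_totals = sum(w * w * t for w, t in zip(weights, totals))

    # Trend statistic T = N * sum(w_i * r_i) - R * sum(w_i * n_i)
    t_stat = n * sum_w_cases - n_cases * sum_w_totals
    # Var(T) = (R * S / N) * (N * sum(w_i^2 * n_i) - (sum(w_i * n_i))^2)
    var_core = n * sum_w2_totals - sum_w_totals ** 2

    # Degenerate tables (one group empty, or a single genotype) carry no trend.
    if n_cases == 0 or n_controls == 0 or var_core == 0:
        return 0.0, 1.0

    chi2 = n * t_stat * t_stat / (n_cases * n_controls * var_core)
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: the risk genotype is more frequent among cases.
    chi2, p = ca_trend_test(cases=(10, 20, 30), controls=(30, 20, 10))
    print(chi2, p)
```

In a screening setting, a statistic of this kind would be computed for every SNP and the loci ranked by p-value. Such a ranking reflects single-locus effects only, which is the limitation the abstract addresses by also drawing on ReliefF.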