Method for extracting the tumor gene based on the support vector machine

QIN Chuandong¹; LIU Sanyang²; ZHANG Shifang¹

  (1. School of Computer Science and Technology, Xidian Univ., Xi'an 710071, China;
   2. School of Science, Xidian Univ., Xi'an 710071, China)
  Received: 2011-05-03  Online: 2012-02-20  Published: 2012-04-06


Colon cancer gene expression profiles are high-dimensional, contain few samples, and are very noisy. To address these characteristics, a method is proposed that first scores each gene with the Bhattacharyya distance and removes the genes irrelevant to the classification task. A second extraction step then selects among the remaining genes according to their sensitivity with respect to the support vector machine (SVM) model. The normalized sensitivity is used to weight the important genes, which yields a new sample set containing only a small number of important tumor-related genes. Finally, an SVM is used to analyze and test the feature genes on the new sample set. Experimental results show that this method improves the accuracy of tumor diagnosis.

Key words: gene expression profiles, Bhattacharyya distance, sensitivity, support vector machine, four-fold cross-validation
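
The two selection steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions. Each gene's expression is modelled as a univariate Gaussian within each class, which gives a closed-form Bhattacharyya distance. The sensitivity of gene j is taken to be the squared weight w_j^2 of a linear SVM, a common choice that the abstract does not confirm. The weight vector `w` is supplied by hand rather than trained. The names `bhattacharyya_distance`, `rank_genes` and `sensitivity_weights` and the toy data are hypothetical.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_distance(a, b, eps=1e-12):
    """Bhattacharyya distance between two 1-D samples, each modelled as a Gaussian."""
    mu_a, mu_b = mean(a), mean(b)
    var_a = pvariance(a) + eps  # eps guards against zero variance
    var_b = pvariance(b) + eps
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def rank_genes(samples, labels, keep):
    """Return the indices of the `keep` genes whose two classes are farthest apart."""
    n_genes = len(samples[0])
    scores = []
    for j in range(n_genes):
        pos = [row[j] for row, y in zip(samples, labels) if y == 1]
        neg = [row[j] for row, y in zip(samples, labels) if y != 1]
        scores.append(bhattacharyya_distance(pos, neg))
    order = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return order[:keep]


def sensitivity_weights(w):
    """Scale squared linear-SVM weights to [0, 1] (assumed sensitivity measure)."""
    sens = [wj * wj for wj in w]
    top = max(sens)
    return [s / top for s in sens] if top > 0 else sens


# Toy data: 6 samples x 3 genes; only gene 0 separates the two classes.
samples = [
    [5.0, 1.0, 0.2],
    [5.2, 0.9, 0.8],
    [4.8, 1.1, 0.5],
    [1.0, 1.0, 0.4],
    [1.2, 0.9, 0.7],
    [0.8, 1.1, 0.3],
]
labels = [1, 1, 1, 0, 0, 0]

selected = rank_genes(samples, labels, keep=1)
print(selected)
```

In a fuller pipeline, the genes kept by `rank_genes` would be passed to a linear SVM, the expression values would be rescaled by the output of `sensitivity_weights`, and the accuracy would be estimated by four-fold cross-validation as listed in the key words.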


