Journal of Xidian University ›› 2019, Vol. 46 ›› Issue (6): 81-87. doi: 10.19665/j.issn1001-2400.2019.06.012


Method of cancer biomarker prediction in the gene regulatory network

QIN Guimin, LIU Jiayan, YIN Yu, YANG Luqiong

  1. School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
  • Received: 2019-05-22 Online: 2019-12-20 Published: 2019-12-21
  • About the author: QIN Guimin (born 1977), female, associate professor, Ph.D. E-mail: gmqin@mail.xidian.edu.cn
  • Supported by:
    Natural Science Foundation of Shaanxi Province (2017JM6038)

Abstract:

Identifying cancer biomarkers from multi-omics data is of great significance for studying the molecular mechanisms of cancer, but most current work relies on protein-protein interaction data. A new method based on gene regulatory networks and multi-omics data is therefore proposed to analyze the molecular mechanisms of cancer and to predict cancer biomarkers. Taking stomach adenocarcinoma (STAD) and esophageal carcinoma (ESCA) as examples, multi-omics data are first integrated to construct a cancer-specific network for each cancer. Weighted gene co-expression network analysis is then carried out on the two networks: modules are partitioned by hierarchical clustering, the relationship between the first principal component of each module and all known cancer biomarkers is calculated, and cancer-specific modules are screened out on that basis. Finally, disease-specific biological pathways are extracted, and potential cancer biomarkers are identified using similarity assessment methods. Experimental results show that the predicted cancer-specific modules have functional characteristics, and that predictions made within a module using the Pearson correlation coefficient are more accurate.

Key words: cancer, gene co-expression network, gene expression regulation, multi-omics data
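
The similarity step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so the ranking below (mean absolute Pearson correlation between a candidate gene and the known biomarkers in the same module) is an assumption, as are the function names, the soft-threshold power `beta=6`, and the toy gene names and expression values. It is not the authors' implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation is undefined, treated as 0 here
    return cov / (sx * sy)


def soft_adjacency(x, y, beta=6):
    """Unsigned weighted co-expression edge |r|^beta (beta=6 is an assumed value)."""
    return abs(pearson(x, y)) ** beta


def rank_candidates(module_expr, known_markers):
    """Rank non-marker genes in one module by mean |r| to the known markers.

    module_expr: dict mapping gene name -> expression profile across samples.
    known_markers: names of known biomarkers; those absent from the module are ignored.
    """
    markers = [g for g in known_markers if g in module_expr]
    scores = {}
    for gene, profile in module_expr.items():
        if gene in markers:
            continue
        sims = [abs(pearson(profile, module_expr[m])) for m in markers]
        scores[gene] = sum(sims) / len(sims) if sims else 0.0
    # Highest similarity first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy module with made-up gene names and expression values (illustration only)
module_expr = {
    "MARKER_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_X": [2.1, 3.9, 6.2, 8.0, 9.9],
    "GENE_Y": [5.0, 1.0, 4.0, 2.0, 3.0],
}
ranking = rank_candidates(module_expr, known_markers=["MARKER_A"])
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

In this toy module, `GENE_X` tracks the known marker closely and is ranked ahead of `GENE_Y`. Restricting the comparison to genes inside one module mirrors the abstract's observation that Pearson-based prediction within modules is more accurate.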

CLC number:

  • TP301