| The rapid development of high-throughput sequencing technology has driven the expansion of genetic multi-omics data, making it possible to interpret the biological significance embedded in these data and providing data support for the study of complex diseases such as tumors. Cancer is one of the greatest threats to human health, and its pathogenesis has been investigated from many angles. Studies have shown that cancer is caused mainly by genetic mutations, which has given rise to the distinction between driver and passenger genes. Mutated genes that are responsible for autonomous cell proliferation are called driver genes, whereas genes that have little or no effect on cancer development are called passenger genes. Distinguishing oncogenic driver genes from the large number of insignificant passenger genes has therefore become an active research topic. To address this problem, numerous prediction algorithms have been developed to identify driver genes. The two most common types are those that predict driver genes from their mutation frequencies across samples and those that additionally incorporate protein interaction networks to improve accuracy. To further improve driver gene prediction, this paper proposes a new method (PairDriver) based on prioritizing paired mutant genes. The algorithm identifies driver genes by combining somatic mutation data, protein-protein interaction networks and differential gene expression data. The main contributions of this paper are as follows: (1) The processing of gene mutation data in previous algorithms is improved by considering both the contribution of the same gene across different samples and the contribution of different mutated genes within the same sample to cancer. (2) Considering that the expression of one mutated gene affects the expression of other mutated genes, this paper uses the differential expression of 
genes between cancer samples and normal samples to construct differential expression networks. (3) We find that driver genes show a stronger tendency to be linked to one another in the protein-protein interaction (PPI) network. We therefore consider the genes in the PPI network and the differential expression network in pairs, design an impact score to quantify the degree of impact of each gene pair, and finally split this score into impact scores for individual genes and rank the individual mutant genes accordingly. Applied to datasets of 10 common cancer types from the TCGA database, the proposed algorithm shows a substantial improvement in identifying driver genes over other classical prediction algorithms. The PairDriver method is expected to provide new theoretical guidance and technical support for the diagnosis and treatment of cancer. |
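
The pair-then-split ranking idea described above can be illustrated with a minimal Python sketch. It is not the PairDriver implementation: the abstract does not give the scoring formulas, so the per-sample mutation weighting, the pair impact score, and the proportional split used here are all assumptions chosen for illustration, and the function names and toy input values are hypothetical.

```python
from collections import defaultdict


def mutation_weights(sample_mutations):
    """Assumed weighting: each sample spreads one unit of weight equally
    over its mutated genes, so a gene mutated in a sample with few other
    mutations counts for more than one mutated in a hypermutated sample."""
    weights = defaultdict(float)
    for genes in sample_mutations.values():
        if not genes:
            continue
        share = 1.0 / len(genes)
        for gene in genes:
            weights[gene] += share
    return dict(weights)


def rank_genes(sample_mutations, ppi_edges, diff_expr):
    """Score gene pairs that are linked in the PPI network, split each pair
    score back onto its two genes, and rank genes by their summed score."""
    w = mutation_weights(sample_mutations)
    gene_score = defaultdict(float)
    for a, b in ppi_edges:
        if a not in w or b not in w:
            continue  # only pairs in which both genes are mutated are scored
        # Assumed pair impact score: combined mutation weight of the pair
        # times the combined magnitude of differential expression.
        expr = abs(diff_expr.get(a, 0.0)) + abs(diff_expr.get(b, 0.0))
        total_w = w[a] + w[b]
        pair_score = total_w * expr
        # Assumed split: proportional to each gene's mutation weight.
        gene_score[a] += pair_score * w[a] / total_w
        gene_score[b] += pair_score * w[b] / total_w
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(gene_score.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy inputs (made-up values, for illustration only).
sample_mutations = {
    "s1": {"TP53", "KRAS"},
    "s2": {"TP53"},
    "s3": {"KRAS", "TTN", "MUC16"},
}
ppi_edges = [("TP53", "KRAS"), ("KRAS", "TTN")]
diff_expr = {"TP53": 2.0, "KRAS": 1.5, "TTN": 0.1}

ranked = rank_genes(sample_mutations, ppi_edges, diff_expr)
for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

In this sketch a gene that is mutated but has no mutated PPI neighbor (MUC16 in the toy data) receives no score at all, which reflects the pairwise framing; whether and how the actual method scores such isolated genes is not stated in the abstract.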