
A Fast Approach To Detect Gene-gene Synergy And Its Application In Disease Classification

Posted on: 2019-02-12, Degree: Master, Type: Thesis
Country: China, Candidate: P W Xing
GTID: 2404330596988611, Subject: Bioinformatics
Abstract/Summary:
Selecting informative genes from expression data is useful for medical diagnosis and prognosis. The pathogenesis of complex human diseases (e.g., cancer, cardiovascular disease, and cerebrovascular disease) is complicated: these diseases are often driven by gene-gene and gene-environment interactions, and their inheritance does not follow Mendel's laws. Individual gene ranking techniques such as the t-test therefore provide no insight into gene-gene interactions and cannot find pathogenic genes that act through them. Developing new informative-gene selection algorithms from the perspective of gene-gene interactions is thus of great significance for medical diagnosis, treatment, and pathogenesis analysis.

The classic model of gene-gene interaction effects was originally defined by John Watkinson et al.: the expression level of each of two genes may be totally uncorrelated with cancer, yet the pair of expression levels is sufficient to distinguish health from cancer, because cancer occurs when the two genes are either both ‘on’ or both ‘off’. That work also gives a diagram of this ideal interaction pattern. However, the Dendrogram method proposed by Watkinson et al. does not detect genes with this pattern. The Doublets method detects interacting genes through gene-pair transformations and offers four of them (Sum, Diff, Mul, Sign), but the genes it selects have strong individual effects, and it cannot find gene pairs with the classic interaction. The MIC3variables method can detect interacting gene pairs, but it involves an optimization over three variables; its complexity is high, it is slow, and it cannot give a mathematical expression for the interacting genes, which limits its application in disease classification.

The Abs convert method, which is based on the t-test (an individual-effect gene ranking method), detects gene-gene interactions effectively. The algorithm has low complexity and runs fast, which suits high-dimensional gene expression data. Verified on real gene expression datasets, it finds biologically significant gene pairs and clearly outperforms the other methods.

A classifier cannot learn well when its input features contain many interacting gene pairs, so each gene pair should be converted into a new variable before being used as an input feature. For example, with 10 interacting gene pairs among the input features, converting them into 10 new vectors before passing them to the classifier yields 95.5% accuracy (5-fold cross-validation), whereas leaving them unconverted yields only 65.5%. The interacting gene pairs selected by the Abs method were then used as input features for classification. Combining individual-effect genes with interaction genes gave significantly better prediction accuracy than the other methods. With the same number of genes as input, feature subsets that combine interaction genes with individual-effect genes also predicted significantly better than individual-effect genes alone: across three datasets, the average accuracy was 80.36% with 40 individual-effect genes as classifier input, and rose to 85.78% with 20 individual-effect genes plus 10 interacting gene pairs. Combining individually discriminant genes with interaction genes therefore improves prediction performance.
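
The interaction pattern described above (cancer when two genes are both ‘on’ or both ‘off’) can be illustrated with a small synthetic example. The abstract does not give the formula of the Abs conversion, so the sketch below assumes one plausible form, the absolute value of the sum of the two standardized expression levels, and ranks it with Welch's t statistic. The simulated data, the function names, and the nearest-centroid classifier are all hypothetical stand-ins chosen for illustration, not the thesis's datasets, code, or classifier.

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch's two-sample t statistic for two lists of numbers."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

def zscore(values):
    """Standardize a list to zero mean and unit standard deviation."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mu) / sd for v in values]

def simulate(n_per_state, rng, noise=0.25):
    """Synthetic gene pair: label 1 (cancer) when both genes are on or both off."""
    g1, g2, labels = [], [], []
    for s1, s2 in [(1, 1), (-1, -1), (1, -1), (-1, 1)]:
        for _ in range(n_per_state):
            g1.append(s1 + rng.gauss(0, noise))
            g2.append(s2 + rng.gauss(0, noise))
            labels.append(1 if s1 == s2 else 0)
    return g1, g2, labels

def split(values, labels):
    """Split values into (cancer, healthy) lists by label."""
    cancer = [v for v, lab in zip(values, labels) if lab == 1]
    healthy = [v for v, lab in zip(values, labels) if lab == 0]
    return cancer, healthy

def centroid_accuracy(features, labels):
    """Training accuracy of a nearest-centroid classifier on feature tuples."""
    def centroid(cls):
        rows = [f for f, lab in zip(features, labels) if lab == cls]
        return tuple(statistics.mean(col) for col in zip(*rows))
    c0, c1 = centroid(0), centroid(1)
    hits = 0
    for f, lab in zip(features, labels):
        pred = 1 if math.dist(f, c1) < math.dist(f, c0) else 0
        hits += pred == lab
    return hits / len(labels)

rng = random.Random(0)
g1, g2, y = simulate(50, rng)
z1, z2 = zscore(g1), zscore(g2)

# Assumed "Abs" conversion (not confirmed by the abstract): one new variable per pair.
pair = [abs(a + b) for a, b in zip(z1, z2)]

# Each gene alone is uninformative; the converted pair is strongly discriminative.
t_g1 = welch_t(*split(z1, y))
t_g2 = welch_t(*split(z2, y))
t_pair = welch_t(*split(pair, y))
print(f"t(gene 1) = {t_g1:.2f}, t(gene 2) = {t_g2:.2f}, t(pair) = {t_pair:.2f}")

# Same comparison for classification: raw pair versus converted variable.
acc_raw = centroid_accuracy(list(zip(z1, z2)), y)
acc_pair = centroid_accuracy([(p,) for p in pair], y)
print(f"accuracy raw = {acc_raw:.2f}, accuracy converted = {acc_pair:.2f}")
```

In this toy setting the individual t statistics stay close to zero while the converted variable gets a large one, and a simple classifier that is near chance on the raw pair separates the classes once the pair is converted. Because each candidate pair costs only one conversion and one t-test, scanning all pairs stays cheap, which matches the low complexity claimed for the method; the accuracies printed here are properties of the synthetic data and are unrelated to the figures reported in the abstract.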
Keywords/Search Tags: Gene expression data, Complex disease, Gene-gene interaction, Individual effect gene, Disease classification and prediction