An Approach Of Epistasis Mining Based On Artificial Bee Colony Optimizing Bayesian Network

Posted on: 2021-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: C Yang
GTID: 2370330611483358
Subject: Agricultural Information Engineering
Abstract/Summary:
Genome-wide association analysis is a commonly used method for detecting genetic loci that affect phenotypic traits, and it can be used to analyze the genetic mechanisms underlying complex diseases. However, it mainly focuses on detecting major genes and ignores gene-gene interactions, that is, epistasis. Bayesian networks have unique advantages in modeling causal relationships and have been widely used in epistasis mining research. However, because Bayesian network structure learning usually relies on local or random search strategies, it tends to fall into local optima. This leads to problems such as low learning efficiency and an inability to handle large numbers of nodes. To address these problems, this thesis proposes BnBeeEpi, an epistasis mining approach in which the artificial bee colony algorithm optimizes the Bayesian network. The work consists of three main parts:

(1) To improve the quality of the initial network in BnBeeEpi, we propose the omb-Fast algorithm, which builds on the constraint-based Bayesian network structure learning method Fast-IAMB. In its Markov blanket expansion stage, omb-Fast mainly considers the influence of the phenotypic trait on the gene loci: it adds SNP nodes to the Markov blanket according to the conditional mutual information between the phenotypic trait and the SNP nodes. As a result, the Markov blankets reflect the relationship between epistatic loci and the phenotypic trait more accurately. We use this method to quickly and accurately generate an initial network that includes the SNP loci and the phenotypic trait.

(2) Because Bayesian network search tends to fall into local optima, the accuracy of both network construction and epistasis mining suffers. To solve this problem, we use the artificial bee colony algorithm to optimize the Bayesian network search strategy and propose the BnBeeEpi algorithm. BnBeeEpi first uses omb-Fast to generate the initial network, and then uses three stages (employed bees, onlooker bees, and scout bees) to add, delete, and reverse edges in the network structure. These three stages help the search move toward the globally optimal network structure. To further reduce the false positive rate of epistasis mining, this work uses a hybrid Bayesian network score that combines BIC and MIT to calculate the fitness of each network structure (nectar source). In addition, the decomposable BIC scoring function is used to process large network structures, so the method can handle larger SNP epistasis datasets to some extent.

(3) We compare the efficiency, accuracy, F1-score, and false positive rate (FPR) of BnBeeEpi with those of several other epistasis mining algorithms on simulated epistasis data generated by the GAMETES software. The experimental results show that omb-Fast runs quickly while maintaining accuracy. Although BnBeeEpi takes more time, it achieves a higher F1-score and a lower FPR, and overall it performs better than the other methods. In addition, we verify and analyze BnBeeEpi on real age-related macular degeneration (AMD) data. The epistatic loci detected by this method are well supported by the literature.
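
The conditional mutual information used in point (1) to decide which SNP nodes enter the Markov blanket can be illustrated with a small sketch. The abstract does not say how the thesis estimates this quantity, so the code below assumes a plain plug-in (frequency-count) estimator over discrete genotypes; the function name `conditional_mutual_information` and the toy variables are hypothetical and are not taken from omb-Fast or BnBeeEpi.

```python
from collections import Counter
from math import log


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats.

    x, y, z are equal-length sequences of hashable discrete values,
    one entry per sample (for example SNP genotype, phenotype, and the
    joint state of the SNPs already in the Markov blanket).
    """
    n = len(x)
    if n == 0 or len(y) != n or len(z) != n:
        raise ValueError("x, y and z must be non-empty and of equal length")

    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)

    cmi = 0.0
    for (xi, yi, zi), count in n_xyz.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts
        ratio = count * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (count / n) * log(ratio)
    return cmi


# Toy pure-epistasis example (assumed data): the phenotype is the XOR of two SNPs.
snp_a = [0, 0, 1, 1]
snp_b = [0, 1, 0, 1]
phenotype = [0, 1, 1, 0]
no_condition = [()] * len(phenotype)  # empty conditioning set

marginal = conditional_mutual_information(snp_a, phenotype, no_condition)
given_b = conditional_mutual_information(snp_a, phenotype, snp_b)
print(marginal, given_b)
```

In this toy case `snp_a` carries no information about the phenotype on its own (the first value is 0), but it carries log 2, about 0.693 nats, once `snp_b` is conditioned on. This is the kind of interaction signal that a conditional measure can capture and a marginal association test cannot.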
Keywords/Search Tags: Epistasis, Artificial bee colony algorithm, Bayesian network, Markov blanket