Research On Inferring Algorithms Of Signaling Pathways

Posted on: 2012-06-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y W Li
Full Text: PDF
GTID: 1100330332999400
Subject: Computer application technology
Abstract/Summary:
Systems biology is at the forefront of 21st-century life science and is among its most challenging subjects. Its research emphasis is no longer on the structure and function of individual molecules, but on the large number of complex interactions among all kinds of biological molecules. The multilevel complex networks formed by these interaction data contain a wealth of knowledge about the operating mechanisms of living systems, and mining this latent knowledge has become one of the main tasks of systems biology. Given the complexity of biological systems and the limits of current technology, it is still not possible to reconstruct an entire biological network, so inferring networks at the local level is highly valued. Pathways are a primary form of biological network and the basic constituents of the multilevel network; pathway reconstruction has therefore recently become one of the hotspots of systems biology.

A pathway consists of a series of biological molecules that jointly carry out the same biological process, together with the interactions among these molecules. There are three types of pathways: metabolic pathways, signaling pathways and regulatory pathways. They play highly important roles in cell proliferation, differentiation, metabolism and apoptosis. Research on pathway inference methods may contribute to the understanding of biological processes and to the early diagnosis of disease.

The research objective of this dissertation is to explore the pathways of particular species by computational methods. Because biological data are large in volume, noisy and nonlinear, machine learning algorithms are chosen as the computational method, since they are well suited to such data. On the basis of a theoretical study and discussion of commonly used machine learning algorithms, a series of new pathway prediction methods is proposed.
They are: (1) inferring the topological structure of pathways using a statistical model and the EM algorithm; (2) inferring the dynamics model of pathways using a differential equation model and the particle swarm optimization algorithm; and (3) identifying sequence motifs by means of evolutionary computation. Inferring the topological structure of a pathway reveals the interactions of biological molecules qualitatively, while inferring its dynamics model reflects the spatiotemporal characteristics of a specific life process quantitatively. Each of the proposed methods may provide a meaningful reference for different needs of biological research.

The main contributions and contents are as follows:

(1) We give a brief introduction to systems biology and biological network prediction, and summarize research on methods for inferring biological pathways. We present the research background, applications, research status, challenges and development trends of pathway prediction algorithms. These discussions and analyses provide the necessary targets and direction for research on new algorithms.

(2) We discuss and analyze the statistical models and machine learning algorithms commonly used in systems biology. The specific topics are: the probability theory of maximum likelihood parameter estimation and Bayesian parameter estimation; the characteristics and applications of Markov chains; the probability theory of the hidden Markov model (HMM), together with its evaluation, decoding and parameter estimation problems and their solutions; the principle and working process of the EM algorithm, and the idea of using it to solve optimization problems that contain missing data; the mathematical foundations and implementation techniques of the genetic algorithm and particle swarm optimization; and the techniques and improvements for solving parameter optimization problems with these two kinds of evolutionary algorithms.
These discussions and analyses provide a solid theoretical basis for achieving the research objective.

(3) Existing pathway topology inference methods may produce large errors when they use microarray data alone, and they are also limited by heavy computation, excessive complexity and difficulty of implementation. We therefore propose a new pathway topology inference method that is simple and intuitive to use and can combine various forms of data, such as biological experiment data, literature retrieval results and expert knowledge. First, we establish a Markov chain model by taking the regulation probabilities of gene pairs as the state transition matrix, so that the samples can be viewed as Markov processes sampled from the same Markov model. Second, according to the definition of the Markov model, we construct the likelihood function of the samples being produced by the newly established Markov chain model. Finally, we obtain the estimates of the model parameters by requiring that the value of the likelihood function be maximized. By following the EM algorithm, with its iterated expectation and maximization steps, the method can handle problems that contain missing data. The results of a MAPK/Erk pathway topology inference experiment show the effectiveness of our method, and also show that accuracy increases significantly when prior information is introduced during model initialization.

(4) Because most pathway topology inference methods focus only on the regulation relation, we propose a new HMM-based method to predict the regulation direction. This method makes up for that deficiency of the other methods and makes the results more meaningful.
A group of signaling pathway reconstruction experiments shows the effectiveness of our method.

(5) Because of the model complexity and the difficulty of parameter estimation in many current methods, we propose a new method based on particle swarm optimization to infer the dynamics model of a pathway. We first choose differential equations as the dynamics model of the pathway, based on the existing chemical reactions and molecular interactions. We then apply the particle swarm algorithm to estimate the dynamic parameters of the model. The proposed method can obtain the global optimal solution, so it is better than the HJA algorithm, which can only reach local optimal solutions. Moreover, it converges more quickly than methods based on gradient descent or the genetic algorithm. A metabolic pathway dynamics simulation experiment shows the effectiveness of the proposed algorithm, and also shows its practicality in solving nonlinear constrained optimization problems and parameter estimation problems.

(6) Heuristic methods such as Gibbs sampling and MEME may suffer from high computational cost, a tendency to fall into local minima and low prediction accuracy. We therefore propose a new method based on the genetic algorithm to identify sequence motifs. In this method we select the position weight matrix as the motif model and encode it as the chromosome of the genetic algorithm. A series of improvements is then made to the genetic operations. First, the biological characteristic that some bases may appear as a group is taken into account in the fitness evaluation, and a small number of bases are allowed to mutate within a given motif. This may improve prediction precision to some extent. Second, not all individuals of the initial population are generated randomly; a few are generated from the results of multiple sequence alignment, so the convergence of the new algorithm is accelerated.
Third, a new selection operator based on fitness and individual concentration is proposed, which may overcome the premature convergence problem that genetic algorithms commonly suffer from. The algorithm is applied to predicting motifs in artificial sequences, in 12 promoter sequences of RAP1 co-regulated genes in yeast, and in 18 promoter sequences of CRP co-regulated genes in Escherichia coli. The results of these three groups of experiments demonstrate its effectiveness.

The results of this dissertation provide more meaningful methods for solving pathway inference problems, and more advice and help for the design of biological experiments. In addition, our work enriches the applied research on machine learning theory in the fields of probability analysis, parameter optimization and evolutionary computation. Although we have done some exploratory work in pathway prediction and achieved some results, these methods may be just a drop in the ocean for recovering complex pathways at the systems level. Many problems, such as inferring pathways that contain feedback loops and predicting crosstalk between pathways, still need better solutions. These will be my future research objectives.
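
As an illustration of the maximum likelihood step described in contribution (3), the following is a minimal sketch of estimating a Markov chain transition matrix from fully observed state sequences. It is not the dissertation's implementation: the function names, the integer state encoding, the toy sequences and the use of pseudocounts to represent prior information are all assumptions made for illustration, and the EM iterations for missing data are not shown.

```python
import numpy as np

def estimate_transition_matrix(sequences, n_states, prior=None):
    """Maximum-likelihood transition matrix from fully observed sequences.

    `prior` is an optional (n_states x n_states) array of pseudocounts,
    a hypothetical way to inject prior knowledge at initialization.
    """
    counts = np.zeros((n_states, n_states))
    if prior is not None:
        counts += np.asarray(prior, dtype=float)
    # Count every observed transition a -> b.
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    # Rows with no observations fall back to a uniform distribution.
    return np.where(row_sums > 0, counts / safe_sums, 1.0 / n_states)

def log_likelihood(sequences, transition):
    """Log-likelihood of the observed transitions under `transition`."""
    total = 0.0
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            total += np.log(transition[a, b])
    return total

# Toy data (assumed): states are integer indices 0..2.
sequences = [[0, 1, 2, 0, 1], [0, 1, 1, 2]]
P = estimate_transition_matrix(sequences, n_states=3)
P_prior = estimate_transition_matrix(sequences, 3, prior=np.ones((3, 3)))
```

Normalizing the transition counts row by row gives the maximizer of the likelihood when all states are observed. With missing observations, the counts would be replaced by their expected values in an E-step before the same normalization is applied in the M-step.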
Keywords/Search Tags: pathway, Bioinformatics, Systems Biology, motif recognition, statistical learning, machine learning, HMM, Markov model, EM algorithm, Genetic algorithm, particle swarm optimization