Construction Of The Gene Regulatory Network Based On The Modified SVM And Realization Of Spark

Posted on: 2018-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: D L Ding
GTID: 2310330536957329
Subject: Engineering
Abstract/Summary:
The study and construction of gene regulatory networks is a central topic in bioinformatics, and the regulatory mechanisms of gene expression play an important role in understanding biological processes and the mechanisms by which diseases arise. At the same time, the rapid development of microarray technology provides strong data support for research on gene regulatory networks. Faced with massive volumes of biological gene sequence data, traditional gene identification technology has many defects, such as high cost, complex principles, poor repeatability and long turnaround times, so it cannot meet the demands of research. Machine learning methods and the big data platform Spark have therefore become an effective new approach to constructing gene regulatory networks in bioinformatics research.

This thesis combines an improved support vector machine (SVM) with the big data mining platform Spark and with known transcription factor data to construct gene regulatory networks, with the aim of making predictions across the whole genome. A gene regulatory network model based on the improved SVM was built to predict transcription factors of Arabidopsis thaliana from the AtGenExpress database, with a recognition rate as high as 93%. The network also predicted relationships between some unknown transcription factors. In addition, experiments showed that the model ran about 7 times faster when deployed on the big data processing platform Spark than in the previous stand-alone mode. In both accuracy and time efficiency, the predicted results improve on the earlier differential equation and clustering analysis algorithms.

In the future, constructing a complete gene regulatory network could clarify which genes act together as the root cause of a disease, providing theoretical support for diagnosis and treatment.
Keywords/Search Tags:Gene regulatory network, Improved support vector machine, Spark, Transcription factors
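The abstract does not specify how the SVM was improved or how it was deployed on Spark, so the following is only a minimal sketch of the general idea: treat each (transcription factor, gene) pair as a sample, build features from their expression profiles, and train an SVM to separate regulating from non-regulating pairs. Everything here is an assumption made for illustration: the data are synthetic, the feature choice (per-condition product of the two expression values) is arbitrary, the classifier is a plain linear SVM trained by stochastic sub-gradient descent rather than the thesis's improved SVM, and the function names `make_pairs`, `train_linear_svm` and `predict` are hypothetical. It runs on a single machine and does not use Spark.

```python
import random


def make_pairs(n_pairs, n_conditions, rng):
    """Build synthetic (TF, gene) pairs (illustrative data, not from AtGenExpress).

    Feature j is the product of the TF and gene expression values in
    condition j. Label +1 marks a "regulated" pair, -1 an unrelated pair.
    """
    samples, labels = [], []
    for k in range(n_pairs):
        tf = [rng.gauss(0.0, 1.0) for _ in range(n_conditions)]
        if k % 2 == 0:
            # Regulated pair: the gene tracks the TF with some noise.
            gene = [v + 0.3 * rng.gauss(0.0, 1.0) for v in tf]
            label = 1
        else:
            # Unrelated pair: the gene profile is independent of the TF.
            gene = [rng.gauss(0.0, 1.0) for _ in range(n_conditions)]
            label = -1
        samples.append([a * b for a, b in zip(tf, gene)])
        labels.append(label)
    return samples, labels


def decision(w, b, x):
    """Signed score w.x + b of a sample under the linear model."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.01, lr=0.01, epochs=50, seed=0):
    """Linear SVM: minimise (lam/2)*||w||^2 + hinge loss by sub-gradient descent."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            violated = y * decision(w, b, x) < 1.0
            # Shrink the weights (gradient of the L2 penalty).
            w = [wj - lr * lam * wj for wj in w]
            if violated:
                # Hinge-loss sub-gradient step for a margin violation.
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 (predicted regulation) or -1 (no regulation)."""
    return 1 if decision(w, b, x) >= 0.0 else -1


rng = random.Random(42)
train_x, train_y = make_pairs(200, 20, rng)
test_x, test_y = make_pairs(100, 20, rng)
w, b = train_linear_svm(train_x, train_y)
correct = sum(predict(w, b, x) == y for x, y in zip(test_x, test_y))
accuracy = correct / len(test_y)
print(f"held-out accuracy on synthetic pairs: {accuracy:.2f}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the 93% recognition rate reported in the abstract. Predicted positive pairs would become edges of the regulatory network; scaling this step across many candidate pairs is the kind of work a platform such as Spark could distribute.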