
A Bioinformatics Study On Hypoxia Tolerance Or Susceptibility Gene

Posted on: 2018-01-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
GTID: 2348330521450972
Subject: Computer system architecture
Abstract/Summary:
As the understanding of genes has deepened, gene therapy has become a new strategy for treating disease. Research on hypoxia tolerance and susceptibility, for example, aims to treat diseases caused mainly by hypoxia. With the widespread use of DNA microarray technology, gene expression data for analysis and prediction can be obtained from collected cells processed on microarrays. Because the volume of gene expression data is very large while the number of related genes with known function is very limited, cluster analysis has become the main approach to analyzing such data. The core idea of a clustering algorithm is to partition a data set by a similarity measure, so that genes with similar or related functions fall into the same cluster; the function of an unknown gene can then be predicted from the genes of known function in its cluster.

In this thesis, the data used in hypoxia tolerance research are integrated and analyzed, and a small local database is designed and built. A clustering algorithm that combines hierarchical clustering and K-means clustering is proposed for the data analysis. Finally, a set of human genes that may participate in hypoxia adaptation is predicted experimentally. The main work is as follows:

(1) Hypoxia-tolerant fruit flies specially cultivated in the Haddad laboratory were chosen as the object of study. The project also required data from a number of public biological databases, including gene data, GO data, and gene orthology data. To make acquisition, processing, and analysis easier, the obtained data were first cleaned, integrated, and analyzed so that the specific meaning of each field was fully understood. Based on this analysis and on third normal form (3NF) database design principles, the database structure and the attributes of each table were designed, and a small dedicated local database was established for subsequent analysis and processing. Software tools were also developed to process and import the data.

(2) The two commonly used clustering algorithms, hierarchical clustering and K-means clustering, are compared, and the four linkage methods used in hierarchical clustering are analyzed. On this basis, a clustering algorithm that combines hierarchical clustering and K-means clustering is proposed. All of the clustering algorithms are then compared using the figure of merit (FOM) measure. The measurements show that the combined algorithm performs best, so its results are used as the basis for subsequent analysis.

(3) The appropriate number of clusters is estimated by observing the inflection point in the FOM measurements. The gene groups related to hypoxia tolerance are then identified among the clusters using two genes known to be associated with hypoxia tolerance. Finally, Drosophila-human orthology analysis is used to find the human genes that may be involved in hypoxia adaptation.

In summary, to identify human genes related to hypoxia tolerance, this thesis designs a small local database to integrate and analyze the experimental data, and proposes a new clustering algorithm that experiments show to outperform the traditional clustering algorithms it was compared with. The algorithm is then used to predict a set of human genes related to hypoxia tolerance. These genes suggest a research direction for gene therapy of diseases related to hypoxia tolerance.
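The abstract does not say how hierarchical clustering and K-means are combined, or how the FOM is computed. The sketch below is therefore only an illustration under stated assumptions: it seeds K-means with the centroids of an average-linkage hierarchical clustering (one common way to combine the two), and it computes a leave-one-condition-out figure of merit in which each expression condition is hidden in turn, the genes are clustered on the remaining conditions, and the within-cluster deviation of the hidden condition is measured. All function names (`hierarchical`, `kmeans`, `hybrid_cluster`, `fom`) are hypothetical and are not taken from the thesis.

```python
import math

def dist(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(rows):
    """Column-wise mean of a non-empty list of profiles."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def hierarchical(data, k):
    """Average-linkage agglomerative clustering; returns k lists of row indices."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist(data[i], data[j])
                        for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]
    return clusters

def kmeans(data, centers, max_iter=100):
    """Lloyd's K-means from the given initial centers; returns one label per row."""
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda c: dist(row, centers[c]))
                      for row in data]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [data[i] for i, lab in enumerate(labels) if lab == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = centroid(members)
    return labels

def hybrid_cluster(data, k):
    """Assumed combination: hierarchical centroids used as K-means seeds."""
    seeds = [centroid([data[i] for i in c]) for c in hierarchical(data, k)]
    return kmeans(data, seeds)

def fom(data, k, cluster_fn=hybrid_cluster):
    """Aggregate leave-one-condition-out figure of merit (lower is better)."""
    n, m = len(data), len(data[0])
    total = 0.0
    for e in range(m):
        # Cluster on every condition except e.
        reduced = [row[:e] + row[e + 1:] for row in data]
        labels = cluster_fn(reduced, k)
        sq = 0.0
        for c in set(labels):
            vals = [data[i][e] for i, lab in enumerate(labels) if lab == c]
            mean = sum(vals) / len(vals)
            sq += sum((v - mean) ** 2 for v in vals)
        total += math.sqrt(sq / n)
    return total
```

Evaluating `fom(data, k)` over a range of `k` and looking for the point where the curve stops dropping sharply corresponds to the inflection-point selection described in item (3). The pairwise average-linkage loop is cubic in the number of genes, so this form is only suitable for small illustrative data sets.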
Keywords/Search Tags: Hypoxia Tolerance, Database, Gene Expression Data, Clustering Analysis, FOM Measurement