Contrastive Study And Optimization Of Differentially Methylated Loci Recognition Algorithm

Posted on: 2018-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Song
GTID: 2334330512988946
Subject: Control Science and Engineering
Abstract/Summary:
DNA methylation is one of the most important epigenetic mechanisms and is involved in the pathogenesis of many diseases. It plays a major role in regulating gene expression, maintaining chromatin structure, genomic imprinting, X-chromosome inactivation, embryonic development, and other biological processes. In recent years, differentially methylated loci have been reported across a range of diseases, tissues, and cell types, and have been associated with gene expression levels. Identifying differentially methylated loci is therefore one of the most critical and fundamental problems in dissecting disease etiology. This thesis analyzes a variety of cancers and identifies differentially methylated loci for each, providing evidence for early cancer diagnosis. The main results are as follows.
Firstly, existing statistical hypothesis testing methods select only loci whose methylation differs with statistical significance, and the selected loci do not necessarily discriminate between classes. We propose instead a feature selection method based on machine learning, specifically an Elastic Net regularized feature selection algorithm. This method addresses the shortcoming of hypothesis testing described above and also takes into account the effect of interactions between features (loci) on classification, which helps in discovering multiple methylation loci that jointly affect the occurrence of cancer.
Secondly, as an optimization of the differentially methylated loci recognition algorithm, we propose a robust ensemble feature selection algorithm based on Elastic Net regularization to address the inconsistency in the differentially methylated loci that are selected.
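The idea of ensemble feature selection with Elastic Net regularization can be illustrated with a small sketch. The abstract does not state the loss function, the resampling scheme, or the aggregation rule used in the thesis, so everything below is an assumption made for illustration: a least-squares Elastic Net fitted by coordinate descent on ±1 class labels stands in for the classifier, bootstrap resampling generates the ensemble, and a locus is kept if it is selected in at least `min_frequency` of the rounds. The function names, parameter values, and toy data are all hypothetical.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_select(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Return indices of loci with non-zero Elastic Net coefficients.

    Assumption: least-squares loss on +1/-1 labels, minimized by
    coordinate descent; the thesis may use a different loss or solver.
    """
    n, p = len(X), len(X[0])
    # Center each column and the response so no intercept is needed.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # equals y - Xw while w = 0
    w = [0.0] * p
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # constant column carries no information
            rho = sum(c * r for c, r in zip(col, residual)) / n + z * w[j]
            new_w = soft_threshold(rho, alpha * l1_ratio) / (
                z + alpha * (1.0 - l1_ratio)
            )
            delta = new_w - w[j]
            if delta:
                residual = [r - c * delta for c, r in zip(col, residual)]
                w[j] = new_w
    return {j for j in range(p) if w[j] != 0.0}


def ensemble_select(X, y, n_rounds=20, min_frequency=0.6, seed=0):
    """Keep loci selected in at least min_frequency of bootstrap rounds.

    Assumption: bootstrap resampling and a frequency threshold are one
    common way to build an ensemble; the abstract does not specify either.
    """
    rng = random.Random(seed)
    n = len(X)
    counts = {}
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = elastic_net_select([X[i] for i in idx], [y[i] for i in idx])
        for j in chosen:
            counts[j] = counts.get(j, 0) + 1
    return {j for j, c in counts.items() if c / n_rounds >= min_frequency}


# Hypothetical toy data: locus 0 differs between classes, loci 1-5 are noise.
rng = random.Random(1)
y = [1.0] * 20 + [-1.0] * 20
X = []
for label in y:
    informative = (0.8 if label > 0 else 0.2) + rng.uniform(-0.05, 0.05)
    X.append([informative] + [rng.uniform(0.3, 0.7) for _ in range(5)])

selected = ensemble_select(X, y)
print(sorted(selected))
```

Aggregating over resampled data is intended to make the final locus set less sensitive to which samples happen to be in the training set, which is the robustness property the abstract evaluates.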
Data from 13 types of cancer were used to analyze the stability of feature selection. The results show that, at similar classification performance, the ensemble algorithm outperforms the plain Elastic Net regularized feature selection algorithm on the feature selection robustness index (Jaccard index).
Thirdly, for comparison with existing statistical hypothesis testing methods, we use an independent test set to compare the classification performance of the differentially methylated loci selected by our algorithm with that of the loci obtained by FastDMA and RnBeads. The accuracy of our algorithm is higher than that of both FastDMA and RnBeads, indicating that the loci it selects generalize much better than those selected by the two hypothesis testing methods.
Finally, do the differentially methylated loci have actual biological significance? We carry out a commonality analysis across the cancers. Mapping the differentially methylated loci of each cancer to genes yields 38 consensus genes, and 23 of the common genes were found to be directly related to cancer. We also find 11 metabolic pathways, 9 of which are directly related to cancer. This suggests that the differentially methylated loci identified in this thesis are highly biologically significant.
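The Jaccard index used above as the robustness measure is the size of the intersection of two selected locus sets divided by the size of their union. A minimal sketch follows, assuming that stability is summarized as the mean pairwise Jaccard index over repeated selection runs; the abstract does not say how the index is aggregated, and the locus identifiers are made up for illustration.

```python
from itertools import combinations


def jaccard_index(a, b):
    """|A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def selection_stability(selections):
    """Mean pairwise Jaccard index over a list of selected-locus sets."""
    pairs = list(combinations(selections, 2))
    if not pairs:
        raise ValueError("need at least two selections to compare")
    return sum(jaccard_index(a, b) for a, b in pairs) / len(pairs)


# Hypothetical locus sets from three selection runs.
runs = [
    {"cg01", "cg02", "cg03"},
    {"cg01", "cg02", "cg04"},
    {"cg01", "cg02", "cg03"},
]
stability = selection_stability(runs)
print(round(stability, 3))  # pairwise values 0.5, 1.0, 0.5 -> 0.667
```

A value of 1 means every run selected exactly the same loci, and a value near 0 means the runs barely overlap.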
Keywords/Search Tags: DNA Methylation, Differentially Methylated Loci, Elastic Net, Ensemble Feature Selection, Cancer Commonality