
Statistical Analysis And Theoretical Research Of MeDIP-seq And MRE-seq Data

Posted on: 2014-01-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhou
Full Text: PDF
GTID: 1260330425974823
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Over the last eight years, advances in DNA sequencing have revolutionized the field of genomics. The new sequencing tools make it possible to rapidly produce large amounts of sequence data at greatly reduced cost. Next-generation sequencing (NGS) technologies, including Roche's 454, Illumina's Genome Analyzer and ABI's SOLiD, make whole-genome sequencing and resequencing, transcript sequencing and gene expression quantification, and the study of DNA-protein interactions and DNA methylation feasible, but they also pose new challenges.

Compared with microarray technology, NGS offers higher throughput, greater accuracy, and lower cost. It has been applied across the corresponding fields of biology, for example as chromatin immunoprecipitation sequencing (ChIP-seq), RNA sequencing (RNA-seq), and methylated DNA immunoprecipitation sequencing (MeDIP-seq). The data generated by these technologies differ fundamentally from microarray data: the number of samples is small, the amount of data per sample is large, and, in particular, the signals are discrete. Statistical methods developed for microarray data therefore cannot be applied directly, and how to analyze NGS data statistically remains a challenge.

To detect regions with differing methylation levels, this dissertation develops a novel approach based on methylated DNA immunoprecipitation followed by sequencing (MeDIP-seq) coupled with a complementary, low-cost method, methylation-sensitive restriction enzyme sequencing (MRE-seq). A computational approach that integrates data from these two different but complementary assays and predicts methylation differences between samples has been unavailable. Here, we present a novel integrative statistical framework, M&M (for integration of MeDIP-seq and MRE-seq), that dynamically scales, normalizes, and combines MeDIP-seq and MRE-seq data to detect differentially methylated regions.
Using sample-matched whole-genome bisulfite sequencing (WGBS) as a gold standard, we demonstrate superior accuracy and reproducibility of M&M compared to existing analytical methods for MeDIP-seq data alone. M&M leverages the complementary nature of MeDIP-seq and MRE-seq data to allow rapid comparative analysis between whole methylomes at a fraction of the cost of WGBS. Comprehensive analysis of nineteen human DNA methylomes with M&M reveals distinct DNA methylation patterns among different tissue types, cell types, and individuals, potentially underscoring divergent epigenetic regulation at different scales of phenotypic diversity. We find that differential DNA methylation at enhancer elements, with concurrent changes in histone modifications and transcription factor binding, is common at the cell, tissue, and individual levels, whereas promoter methylation is more prominent in reinforcing fundamental tissue identities.

Finally, to detect enriched methylated CpG sites in MeDIP-seq data, the dissertation develops a model at the level of single CpG sites and uses the EM algorithm to estimate the methylation degree of each CpG site, thereby identifying the enriched methylated CpG sites.
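The abstract does not specify the single-CpG model, so the following is only a minimal sketch of how an EM fit of this general kind could look. It assumes, purely for illustration, that per-site read counts follow a two-component Poisson mixture (a background component and an enriched component); the function names, the initialisation, and the mixture form are all hypothetical and are not taken from the dissertation.

```python
import math


def poisson_logpmf(k, lam):
    # log P(K = k) for a Poisson variable with rate lam
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def em_poisson_mixture(counts, n_iter=200, tol=1e-8):
    """Fit a two-component Poisson mixture to per-site read counts with EM.

    Illustrative assumption, not the dissertation's model.
    Returns (pi, lam_bg, lam_en, post), where post[i] is the estimated
    probability that site i belongs to the enriched component.
    """
    n = len(counts)
    mean = sum(counts) / n
    # Hypothetical initialisation: background below the mean, enriched above.
    pi, lam_bg, lam_en = 0.5, max(0.5 * mean, 1e-3), max(1.5 * mean, 1e-2)
    post = [0.5] * n
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: posterior probability of the enriched component per site.
        ll = 0.0
        for i, k in enumerate(counts):
            a = math.log(pi) + poisson_logpmf(k, lam_en)
            b = math.log(1.0 - pi) + poisson_logpmf(k, lam_bg)
            m = max(a, b)
            log_norm = m + math.log(math.exp(a - m) + math.exp(b - m))
            post[i] = math.exp(a - log_norm)
            ll += log_norm
        # M-step: update the mixing weight and the two Poisson rates.
        w = sum(post)
        pi = min(max(w / n, 1e-6), 1.0 - 1e-6)
        lam_en = max(sum(p * k for p, k in zip(post, counts)) / max(w, 1e-12), 1e-6)
        lam_bg = max(
            sum((1.0 - p) * k for p, k in zip(post, counts)) / max(n - w, 1e-12),
            1e-6,
        )
        # Stop once the log-likelihood no longer changes appreciably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return pi, lam_bg, lam_en, post
```

Calling `em_poisson_mixture` on a list of per-CpG read counts returns a posterior probability for each site; sites with a posterior above a chosen cutoff would be the candidates for enrichment under this simplified assumption.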
Keywords/Search Tags: Next-generation sequencing (NGS), RNA-seq, normalization, MeDIP-seq, MRE-seq, methylation level, CpG, DMR, M&M