The Research For Predicting Three-dimensional Structures Of Chromosomes Based On Hi-C Data

Posted on: 2017-02-24  Degree: Master  Type: Thesis
Country: China  Candidate: W Zhang  Full Text: PDF
GTID: 2180330503992764  Subject: Control Science and Engineering
Abstract/Summary:
3D chromosome reconstruction, i.e., chromosome 3D structure prediction, concerns how to reconstruct the three-dimensional (3D) structures of chromosomes from 2D contact frequency data. Chromosome structure plays important roles in gene regulation, DNA replication, and the maintenance of genome stability. Hi-C, a method based on chromosome conformation capture, was developed to capture genome-wide interactions, which are then processed into a contact frequency matrix used to reconstruct 3D chromosome structures. How to predict the organization of the genome using biological information and computing technology is one of the core research questions in 3D genomics. Several methods have been proposed to obtain a consensus structure or an ensemble of structures. These methods can be categorized as probabilistic models or restraint-based models. Such modeling methods support the systematic study of 3D chromosome structures and provide structural evidence for comprehensive analysis of biological processes.

In this thesis, we propose ShRec3D+, a restraint-based method. We validate the accuracy and efficiency of ShRec3D+ on both simulated data and real Hi-C data, after which the method can be applied to predict 3D genome structure. The main research contents and results are as follows:

1. We identify the shortcoming of ShRec3D through experimental analysis. First, we introduce the principles of the ShRec3D and ChromSDE methods, both of which are restraint-based, and analyze their strengths and weaknesses. Second, we show how to construct three simulated datasets whose structures range from simple to complex. The Hi-C data come from mouse ES cells (mESC) and human GM06990 cells. Third, we validate ShRec3D and ChromSDE on both the simulated data and the Hi-C data. Our test results indicate that the accuracy of ShRec3D depends on the conversion function that converts contact matrices to spatial distance matrices.

2. Based on ShRec3D, we propose a parameter-varying algorithm, named ShRec3D+, to infer a consensus structure. We first describe the implementation of ShRec3D+. First, a conversion function with a variable parameter converts the contact frequencies to spatial distances. Then, a shortest-path algorithm from graph theory completes the missing distance values, and multidimensional scaling (MDS) finds the best structure. Finally, golden section search finds the correct parameter by repeating the above steps. We then validate the accuracy and efficiency of ShRec3D+ on both simulated data and Hi-C data. The results indicate that our method estimates the conversion parameter well, and that its value increases as the resolution increases.

3. We discuss how Hi-C data processed with different normalization methods influence the prediction performance of ShRec3D+. First, we generate simulated yeast data that follow a Poisson distribution, and we introduce two correction methods, YT and HiCNorm, to remove the biases in the Hi-C data from GM06990 cells. Second, we test the performance of ShRec3D+ and ChromSDE on both datasets. The simulation study shows that ShRec3D+ outperforms the other methods. Finally, ShRec3D+ can also reconstruct the 3D structure of the genome well from Hi-C data processed with different normalization methods. However, the correction method applied to the input data affects the performance of ShRec3D+...
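
The pipeline described in item 2 (conversion function, shortest-path completion, MDS, golden section search) can be sketched roughly as follows. This is a minimal illustration and not the thesis implementation: the abstract does not give the conversion function or the search objective, so the power-law form d = f^(-alpha), the normalized embedding error used as the objective, the search interval, and all function names are assumptions made for this example. Golden section search also assumes the objective is unimodal on the interval, which is not guaranteed for real Hi-C data.

```python
import numpy as np


def contacts_to_distances(contacts, alpha):
    """Assumed conversion: d = f ** (-alpha); unobserved pairs become inf."""
    contacts = np.asarray(contacts, dtype=float)
    dist = np.full(contacts.shape, np.inf)
    observed = contacts > 0
    dist[observed] = contacts[observed] ** (-alpha)
    np.fill_diagonal(dist, 0.0)
    return dist


def complete_by_shortest_path(dist):
    """Floyd-Warshall: replace each entry by the shortest path length."""
    dist = dist.copy()
    for k in range(dist.shape[0]):
        # Allow paths that pass through intermediate node k.
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


def classical_mds(dist, dim=3):
    """Classical MDS: embed a distance matrix into `dim` dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:dim]
    # Clip small negative eigenvalues caused by non-Euclidean input.
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))


def reconstruct(contacts, alpha, dim=3):
    """Run conversion, completion, and MDS for one value of alpha."""
    dist = complete_by_shortest_path(contacts_to_distances(contacts, alpha))
    return classical_mds(dist, dim), dist


def embedding_error(contacts, alpha):
    """Hypothetical objective: normalized mismatch between the completed
    distances and the distances of the embedded structure.
    Assumes the contact graph is connected (no inf after completion)."""
    coords, dist = reconstruct(contacts, alpha)
    fitted = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.sum((dist - fitted) ** 2) / np.sum(dist ** 2)


def golden_section_minimize(func, lo, hi, tol=1e-3):
    """Golden section search for the minimum of a unimodal function."""
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = func(c), func(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum lies in [lo, d]; the old c becomes the new d.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = func(c)
        else:
            # Minimum lies in [c, hi]; the old d becomes the new c.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = func(d)
    return (lo + hi) / 2.0


def estimate_alpha(contacts, lo=0.1, hi=2.0, tol=1e-3):
    """Search the assumed interval [lo, hi] for the conversion parameter."""
    return golden_section_minimize(
        lambda a: embedding_error(contacts, a), lo, hi, tol
    )


if __name__ == "__main__":
    # Toy helix whose contacts are exactly 1 / distance.
    t = np.linspace(0.0, 4.0 * np.pi, 20)
    points = np.column_stack([np.cos(t), np.sin(t), 0.2 * t])
    true_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    contacts = np.zeros_like(true_dist)
    contacts[true_dist > 0] = 1.0 / true_dist[true_dist > 0]
    alpha = estimate_alpha(contacts)
    coords, _ = reconstruct(contacts, alpha)
    print("estimated alpha:", alpha, "structure shape:", coords.shape)
```

Each evaluation of the objective repeats the full conversion, completion, and MDS sequence, which mirrors the "repeating the above steps" wording in item 2. A different objective or conversion function can be swapped in without changing the search routine.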
Keywords/Search Tags: chromosome structure prediction, Hi-C contact matrix, restraint-based models, ShRec3D+