The Application Of Absolute Frequency In Nucleosome Identification And Prediction

Posted on: 2013-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Zhang
Full Text: PDF
GTID: 2234330374483371
Subject: Operational Research and Cybernetics
Abstract/Summary:
This thesis uses the dinucleotide absolute frequency of nucleosomes to study nucleosome identification and prediction. The nucleosome is the major structural unit of chromatin in eukaryotic cells. Nucleosome positioning is the determination of the precise location of nucleosomes along the genome sequence, and it is very important for understanding transcription factor binding, transcriptional regulatory mechanisms and a variety of other biological processes. Recent studies have shown that nucleosome positioning plays an important role in basic life processes such as DNA replication, repair, alternative splicing and the regulation of gene transcription, and that it also has a significant impact on the evolution of DNA sequences and of gene expression regulation. In recent years, with the emergence and development of the high-throughput technologies ChIP-chip and ChIP-seq, research on nucleosome positioning has reached a peak and has made considerable progress.

This thesis attempts to build a new nucleosome positioning model on the basis of existing identification and positioning methods. We introduce the nucleosome dinucleotide absolute frequency and use it to quantify every nucleosome, which makes it possible to treat nucleosome data with mathematical methods. We then improve a distance discriminant method to identify and predict nucleosomes. Finally, we test the model on existing data sets, which verifies its validity and feasibility.

The main contributions of this thesis are as follows.

First, we introduce the concept of the nucleosome dinucleotide absolute frequency. Unlike previous work, when considering the sequence dependence of nucleosomes we do not use the dinucleotide frequency, the trinucleotide frequency or other traditional statistics, but the nucleosome dinucleotide absolute frequency (absolute frequency), which results in a more concise vector.

Second, we find a simple method for computing the similarity between nucleosomes, which greatly reduces the computational complexity and the amount of calculation required for large nucleosome data sets.

Third, we combine distance and machine learning methods to predict nucleosome positions. By calculating distances and applying machine learning methods, we set up a nucleosome positioning model. We then verified the model on nucleosome positioning in a Saccharomyces cerevisiae chromosome and obtained very satisfactory accuracy.

Fourth, by analyzing the nucleosome prediction results, we summarize the scope and limitations of the new model. Using the nucleosome dinucleotide absolute frequency gives more accurate results, but many factors influence nucleosome positioning, such as DNA sequence dependence, competition and cooperation among protein molecules, and ATP-dependent remodeling. If further features, such as periodicity and curvature, were added to the vector, the results might improve. In addition, the dependence of nucleosome positioning on the dinucleotide absolute frequency differs between species, so the prediction results differ as well. The accuracy therefore remains subject to experimental verification, and the scope of application of the new method needs further study.
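The pipeline described above (turn each sequence into a 16-component dinucleotide vector, then assign it to the nearest class by distance) can be illustrated with a minimal sketch. The abstract does not define the absolute frequency or the improved distance discriminant, so the code below rests on assumptions: "absolute frequency" is taken to mean the raw count of each dinucleotide over overlapping positions, the discriminant is a plain nearest-centroid rule with Euclidean distance, and the two classes are labeled "nucleosome" and "linker". All function names are hypothetical and are not taken from the thesis.

```python
from itertools import product
import math

# The 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_counts(seq):
    """Return raw counts of the 16 dinucleotides in seq (overlapping pairs).

    Assumption: 'absolute frequency' means these unnormalized counts.
    Pairs containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    return [counts[d] for d in DINUCLEOTIDES]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify(seq, nucleosome_centroid, linker_centroid):
    """Assign seq to the class whose centroid is nearer (ties go to nucleosome)."""
    v = dinucleotide_counts(seq)
    d_nuc = euclidean(v, nucleosome_centroid)
    d_link = euclidean(v, linker_centroid)
    return "nucleosome" if d_nuc <= d_link else "linker"


# Toy sequences for illustration only; they are not real nucleosome data.
nuc_centroid = centroid([dinucleotide_counts(s) for s in ["AAAA", "AAAT"]])
link_centroid = centroid([dinucleotide_counts(s) for s in ["GCGC", "GGCC"]])
label = classify("AAAA", nuc_centroid, link_centroid)
```

Raw counts are only comparable between sequences of equal length, so a sketch like this would presumably be applied to fixed-length windows; the window length used in the thesis is not stated in the abstract.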
Keywords/Search Tags: nucleosome positioning, dinucleotide absolute frequency, sequence distance, machine learning
Related items