Research on Identification of Promoter Types Based on Next-Generation Sequencing Data

Posted on: 2018-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: L K Jiang
Full Text: PDF
GTID: 2310330536481934
Subject: Computer Science and Technology
Abstract/Summary:
Human whole-genome research has entered the postgenomic era, in which decoding, interpreting and exploiting genome function is the core research content. With the rapid development of sequencing technology, the identification of gene products and phenotypic function has entered a new "large-scale, high-throughput" stage. The regulation of gene expression has become a hotspot in genomics. Promoters are key elements in the gene expression regulatory network, and identifying promoter types is key to further understanding the complex regulatory mechanisms of the human genome. In this thesis, we first preprocessed the refGene data and named the resulting dataset singleGene. We then calculated and analyzed gene expression levels (RPKM) in five cell lines (HepG2, HUVEC, GM12878, K562 and H1-hESC). Next, because RNA polymerase II is enriched in promoter regions, we used Pol II ChIP-seq data together with gene RPKM to identify active promoters and poised promoters. After that, we analyzed alternative promoters across multiple cell lines. Finally, we selected a region totaling 2 kbp upstream and downstream of the gene transcription start site as the candidate region and divided it into 10 contiguous, non-overlapping bins to study the distribution of histone modification signals in these bins in H1-hESC, HUVEC and GM12878. We also analyzed the specificity of the histone modification distribution across different promoter types. Using the H1-hESC histone modification features as training data, we trained a machine-learning classifier to identify and predict the types of candidate promoters in the HUVEC and GM12878 cell lines.
Keywords/Search Tags: promoter, histone modification, gene expression, RNA polymerase II
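
Two steps named in the abstract, the RPKM expression measure and the division of a 2 kbp window around the transcription start site into 10 bins, can be illustrated with a minimal Python sketch. This is not the thesis's code. The function names `rpkm` and `promoter_bins` are hypothetical, and the symmetric split of the window (1 kbp on each side of the TSS) is an assumption, since the abstract gives only the total length.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    Standard definition: reads * 1e9 / (length in bp * total mapped reads).
    """
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def promoter_bins(tss, strand="+", flank=1000, n_bins=10):
    """Split [tss - flank, tss + flank) into contiguous, non-overlapping bins.

    Assumption: the window is symmetric around the TSS (flank bp each side).
    Bins are returned as half-open (start, end) pairs ordered from upstream
    to downstream relative to the direction of transcription.
    """
    start = tss - flank
    width = (2 * flank) // n_bins
    bins = [(start + i * width, start + (i + 1) * width) for i in range(n_bins)]
    # On the minus strand, upstream corresponds to higher genomic coordinates.
    return bins if strand == "+" else bins[::-1]


# Hypothetical numbers, for illustration only.
expression = rpkm(read_count=500, gene_length_bp=2000, total_mapped_reads=10_000_000)
bins = promoter_bins(tss=5000, strand="+")
```

With these defaults each bin spans 200 bp; a per-bin histone modification signal (for example, the mean ChIP-seq coverage in each bin) would then give a 10-value feature vector per mark for each candidate promoter.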