Analysis Of DNA Protein Binding Sites With DNase Deviation Signals

Posted on: 2018-01-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Song
GTID: 2310330542991407
Subject: Control Science and Engineering
Abstract/Summary:
Since the launch of the ENCODE project, research into the functions of DNA and the biological information it carries has continued without pause. A core task in this research is the analysis of DNA binding sites, and understanding the interaction between DNA-binding proteins and their binding sites is key to analyzing the regulation of gene expression. Compared with ChIP-Seq, DNase-Seq allows protein-binding sites to be detected across the entire genome, and its resolution can reach a single base. In this study, ChIP-Seq and DNase-Seq data were obtained from the same sample, and the exact DNA-protein binding sites verified by ChIP-Seq were used to extract the corresponding DNase-Seq information. During DNase-Seq preprocessing, it was found that cleavage tends to occur at certain base combinations, which biases the signal. A formula-derived algorithm was then proposed to eliminate this bias. Using the bias-corrected DNase-Seq data, the signals inside the binding-site window were taken as positive data and the signals outside the window as negative data. Finally, an SVM model was trained to predict the binding sites of DNA-binding proteins. In validation, the model trained on bias-corrected data achieved better prediction accuracy than the model trained on untreated data.
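The bias-removal step can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formula, so the code below is only an assumption about one common way to do it: estimate how strongly each k-mer is cut relative to the overall average, then divide the raw cut counts by that factor. The function names `kmer_bias` and `correct_signal`, the choice `k=2`, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def kmer_bias(seq, cuts, k=2):
    """Estimate a relative cleavage preference for each k-mer (assumed model).

    bias[kmer] = (mean cut count at positions starting with kmer)
                 / (mean cut count over all positions)
    """
    n = len(seq) - k + 1
    total = defaultdict(float)
    count = defaultdict(int)
    for i in range(n):
        kmer = seq[i:i + k]
        total[kmer] += cuts[i]
        count[kmer] += 1
    overall = sum(cuts[:n]) / n
    if overall == 0:
        # No cuts at all: nothing to correct, so every k-mer is neutral.
        return {kmer: 1.0 for kmer in count}
    return {kmer: (total[kmer] / count[kmer]) / overall for kmer in count}


def correct_signal(seq, cuts, bias, k=2):
    """Divide each cut count by the bias of the k-mer starting at that position."""
    n = len(seq) - k + 1
    corrected = []
    for i in range(n):
        b = bias.get(seq[i:i + k], 1.0)
        # Leave the count unchanged when no usable bias estimate exists.
        corrected.append(cuts[i] / b if b > 0 else float(cuts[i]))
    return corrected


# Toy data: "AC" positions are cut four times as often as "CA" positions.
seq = "ACACACAC"
cuts = [4, 1, 4, 1, 4, 1, 4, 0]
bias = kmer_bias(seq, cuts)
corrected = correct_signal(seq, cuts, bias)
```

In this toy case the only variation in the signal comes from sequence preference, so the corrected values come out equal at every position. In a real pipeline the bias would more plausibly be estimated on a separate background (for example, regions without binding), and windows of the corrected signal inside and outside ChIP-Seq-verified sites would serve as the positive and negative examples for an SVM classifier such as scikit-learn's `SVC`.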
Keywords/Search Tags: DNA protein binding site, DNase-Seq technique, ChIP-Seq technique, Deviation signal, Prediction model