Feature Description And Recognition Algorithm Of Eukaryotic Gene Splice Site

Posted on: 2015-01-04  Degree: Master  Type: Thesis
Country: China  Candidate: D J Zheng  Full Text: PDF
GTID: 2310330518488422  Subject: Biological Information Science and Technology
Abstract/Summary:
Gene splicing is an important step in gene expression: it affects protein translation and thereby influences biological processes. Identifying splice sites plays an important role in gene discovery and gene structure determination, and has therefore become a major concern in bioinformatics. Compared with traditional experimental methods, computational methods reduce experimental cost and improve efficiency. Following a survey of splice site recognition algorithms, this thesis proposes improvements to feature description and to identification algorithms. It focuses mainly on methods for predicting human gene splice sites, and comprises the following: (1) Using the concept of information entropy, the reduction in uncertainty at each position of a splice site sequence is calculated and defined as the information quantity of that position. The information quantity is then used to build a Bayesian model for predicting human splice sites. (2) Splice sites lie at the junctions between exons and introns. To account for the internal relationships between exon/intron and intron/exon boundaries, a more complex directed graphical model, the Hidden Markov Model, is applied. (3) A support vector machine (SVM) model is built to evaluate different feature extraction methods for predicting human gene splice sites. (4) The impact of different feature extraction methods on predicting human gene splice sites is evaluated. The thesis examines primitive features, features based on information quantity, statistical features, PCA-based features and KPCA-based features. Results computed on the Homo Sapiens Splice Sites Dataset indicate that features based on information quantity, PCA and KPCA further improve the prediction accuracy for human gene splice sites.
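The information quantity in item (1) can be illustrated with a short sketch. The abstract does not give the thesis's exact formula, so this assumes the common definition: the maximum entropy of a four-letter alphabet (2 bits) minus the Shannon entropy observed at each aligned position. The function name `information_per_position` and the toy sequences are hypothetical and are not taken from the Homo Sapiens Splice Sites Dataset.

```python
import math
from collections import Counter


def information_per_position(seqs, alphabet="ACGT"):
    """Return the uncertainty reduction (in bits) at each aligned position.

    Assumed definition: log2(|alphabet|) minus the Shannon entropy of the
    observed base frequencies at that position.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    max_entropy = math.log2(len(alphabet))
    info = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts[b] for b in alphabet)
        if total == 0:
            # No recognised bases at this position: report no information.
            info.append(0.0)
            continue
        entropy = 0.0
        for b in alphabet:
            p = counts[b] / total
            if p > 0:
                entropy -= p * math.log2(p)
        info.append(max_entropy - entropy)
    return info


# Toy donor-like windows (made up for illustration); the conserved "GT"
# sits at indices 2 and 3.
toy_windows = ["AGGTAAGT", "CGGTGAGT", "AGGTAAGA", "TGGTCAGT"]
toy_info = information_per_position(toy_windows)
for position, bits in enumerate(toy_info):
    print(position, round(bits, 3))
```

Fully conserved positions reach the 2-bit maximum, while variable positions score lower. Under this assumption, the per-position values could serve as weights or features for a downstream classifier such as the Bayesian model or the SVM mentioned above.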
Keywords/Search Tags:splice, information quantity, feature extraction, SVM, KPCA