Research On Signal Peptide Prediction Algorithm In Nuclear Protein Sequences

Posted on: 2021-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Guo
Full Text: PDF
GTID: 2480306503471744
Subject: Control Engineering
Abstract/Summary:
Nuclear localization signals (NLSs) are peptide segments of proteins that bind to carrier proteins. They are contiguous stretches of amino acids within a protein sequence that direct the transport of nuclear proteins into the nucleus. Because they carry key information for nuclear localization, identifying NLSs can help elucidate the functions of protein complexes. NLSs have also become a major topic in the research and treatment of many diseases. However, experimental identification of these signals is expensive, and only a limited number of NLSs have been identified so far. It is therefore important to develop prediction algorithms for NLSs. Although several automatic NLS prediction methods have been proposed, they are usually specific to certain species or depend heavily on prior knowledge of the basic residues in NLSs. A more advanced predictor is highly desirable to reduce the potentially high false-positive or false-negative rates when discovering new NLSs. In this thesis, I fuse statistical knowledge with a machine learning algorithm and large-scale frequent pattern mining. A supervised machine learning algorithm needs a training dataset of experimentally verified NLSs to construct its model, and because the number of validated NLSs is limited, it tends to find NLSs rich in the already known basic residues. To enable my prediction protocol to discover more unknown NLSs, I propose using the unsupervised BioPM algorithm to mine frequent nuclear motifs that are enriched in nuclear protein sequences but sparse in the non-nuclear sequence database. These mined frequent pattern motifs are considered together with the statistical knowledge and the machine learning scoring outputs. My experimental results on the test datasets show that this consensus scoring model improves NLS prediction accuracy and is capable of discovering new potential NLSs.
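The idea of mining motifs that are frequent in nuclear sequences but rare in non-nuclear ones, and then combining that evidence with other scores, can be illustrated with a small Python sketch. This is not the BioPM algorithm or the scoring model of the thesis: it substitutes a plain fixed-length k-mer support count for the pattern miner and an equal-weight average for the consensus score. The function names, thresholds, weights, and toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def kmer_support(sequences, k):
    """Return, for each k-mer, the fraction of sequences containing it at least once."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each k-mer is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    total = len(sequences)
    return {kmer: c / total for kmer, c in counts.items()}


def enriched_motifs(nuclear, non_nuclear, k=4, min_support=0.5, max_background=0.1):
    """Hypothetical stand-in for frequent pattern mining: keep k-mers that are
    frequent in nuclear sequences but sparse in non-nuclear sequences."""
    foreground = kmer_support(nuclear, k)
    background = kmer_support(non_nuclear, k)
    return sorted(
        kmer for kmer, support in foreground.items()
        if support >= min_support and background.get(kmer, 0.0) <= max_background
    )


def consensus_score(stat_score, ml_score, motif_hit, weights=(1.0, 1.0, 1.0)):
    """Assumed consensus: weighted average of a statistical score, a machine
    learning score, and a 0/1 indicator for a mined-motif match."""
    w_stat, w_ml, w_motif = weights
    total = w_stat * stat_score + w_ml * ml_score + w_motif * float(motif_hit)
    return total / sum(weights)


# Toy sequences (made up for this example, not taken from the thesis datasets).
nuclear = ["MAPKKKRKVEDL", "GSPKKKRKVAAT", "TTPKKKRKVLLS"]
non_nuclear = ["MAGSLLTTEEDL", "GAVLLSTTPEAG"]

motifs = enriched_motifs(nuclear, non_nuclear, k=4, min_support=1.0)
score = consensus_score(0.8, 0.6, motif_hit=True)
```

On these toy inputs, the four 4-mers shared by all three nuclear sequences (KKKR, KKRK, KRKV, PKKK) are returned, since none of them occurs in the non-nuclear set. A real pattern miner would also handle variable-length and gapped motifs, which this sketch does not attempt.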
Keywords/Search Tags: nuclear localization signal prediction, machine learning, frequent pattern, data mining