
A Machine Learning-based Investigation Into Important Determinants And Predictive Modelling Of Protease-specific Substrate Cleavage Targets

Posted on: 2019-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Wang
Full Text: PDF
GTID: 2370330590967321
Subject: Control Science and Engineering
Abstract/Summary:
Proteolysis is regulated at a systems level and plays a critical role in a myriad of important cellular processes, such as blood coagulation, cell proliferation and apoptosis. The key to better understanding the mechanisms that control this process is to identify the natural substrates of proteases. To address this, we develop PROSPER 2.0, a bioinformatics tool for the accurate prediction of protease-specific substrates and their cleavage sites, which builds upon our previously developed PROSPER tool. PROSPER 2.0 is a significantly advanced version of PROSPER, aimed at providing optimized cleavage site prediction models with better prediction performance and coverage of more species-specific proteases (up to 38 proteases). PROSPER 2.0 integrates heterogeneous sequence and structural features derived from multiple levels, in combination with a two-step feature selection procedure. Benchmarking experiments using cross-validation and independent tests show that PROSPER 2.0 achieves more competitive performance than several existing generic tools. We anticipate that PROSPER 2.0 will be a powerful tool for proteome-wide prediction of protease-specific substrates and their cleavage sites, and that it will facilitate hypothesis-driven discovery of novel substrates.

In addition, to improve model performance for six proteases from the Matrix Metalloproteases (MMPs), we propose a new knowledge-transfer computational framework that transfers hidden shared knowledge from other MMP types to enhance the prediction of substrate cleavage sites. The framework uses support vector machines combined with transfer learning and feature selection. The results show that transfer-learning-based methods are more robust than traditional feature-selection methods for predicting the substrate cleavage sites of the six MMPs on the independent tests. The results also demonstrate that the proposed framework provides a useful alternative for characterizing sequence-level determinants of MMP-substrate specificity.
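
As a rough illustration of what a sequence-level feature for cleavage site prediction can look like, the sketch below extracts a fixed-length window around each candidate peptide bond and one-hot encodes it. This is not the feature set or code of PROSPER 2.0. The window of four residues on each side of the bond, the padding symbol `X`, and the function names `cleavage_windows` and `one_hot` are all assumptions made for this example.

```python
# Illustrative sketch only: window size, padding symbol, and function names
# are assumptions, not details taken from the thesis.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # hypothetical padding symbol for positions beyond the sequence ends


def cleavage_windows(sequence, flank=4):
    """Yield (bond_index, window) for every peptide bond in the sequence.

    bond_index i is the bond between residues i-1 and i (0-based), so the
    window holds `flank` residues before the bond and `flank` after it.
    """
    padded = PAD * flank + sequence + PAD * flank
    for i in range(1, len(sequence)):
        # Residue i of the sequence sits at index i + flank in `padded`,
        # so the window around the bond starts at index i.
        yield i, padded[i:i + 2 * flank]


def one_hot(window):
    """Encode a window as a flat 0/1 list with 20 entries per position.

    Padding positions match no residue and are encoded as all zeros.
    """
    vector = []
    for residue in window:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


if __name__ == "__main__":
    # Toy sequence chosen for the example, not a real substrate.
    for bond, window in cleavage_windows("MKTAYIAK"):
        print(bond, window, sum(one_hot(window)))
```

Vectors of this kind could be passed to a feature selection step and then to a classifier such as a support vector machine, with each window labelled as a cleavage or non-cleavage site. The structural features mentioned in the abstract would need separate encoders.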
Keywords/Search Tags: Proteases, Cleavage sites, Substrates, Protein, Transfer learning, Feature selection