Prediction Of Amidation Sites Based On Ensemble Learning

Posted on: 2019-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhou
Full Text: PDF
GTID: 2370330626952411
Subject: Computer technology
Abstract/Summary:
After synthesis, many neuropeptides and peptide hormones must be amidated at their carboxyl termini to attain full biological activity. Amidation plays an important role in various pathological processes, so studying it is of great significance. Traditional experimental methods such as biological mass spectrometry are labor-intensive, time-consuming, and costly. A single conventional machine learning algorithm does not cover the feature space well, so this thesis proposes an ensemble learning method based on the stacking algorithm, which improves amidation site prediction compared with previous methods.

Features from three extraction methods (high-quality index, amino acid position-specific propensity, and K-spaced amino acid composition) are combined and, after feature selection, used to train a support vector machine, a decision tree, and a naive Bayes model. In addition, K-spaced amino acid composition and amino acid factor features are each used to train a corresponding optimal support vector machine model. This yields five models, which serve as the base models of the stacking algorithm. Their outputs on validation data form a 5-dimensional feature on which a logistic regression model is trained, giving a final model with good generalization ability.

The method draws on several types of feature information and, because it combines different types of classification algorithms, regions of the feature space misclassified by one classifier can be corrected by the others. On an independent test set, the model achieved a sensitivity of 93.3%, a specificity of 97.8%, and an accuracy of 96.9%. The experimental results show that the proposed method improves on other methods.
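To make the K-spaced amino acid composition feature named in the abstract more concrete, the following is a minimal Python sketch of one common formulation: the frequency of residue pairs separated by k positions in a peptide window. The function names, the 20-letter alphabet, the choice of max_k, and the normalization by the number of valid pairs are assumptions for illustration; the abstract does not give the thesis's exact definition or window length.

```python
from itertools import product

# Assumption: only the 20 standard amino acids are encoded.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def k_spaced_pair_composition(peptide, k):
    """Frequency of each residue pair (x, y) where y follows x with k residues between."""
    counts = dict.fromkeys(PAIRS, 0)
    total = 0
    for i in range(len(peptide) - k - 1):
        pair = peptide[i] + peptide[i + k + 1]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PAIRS)
    # Assumption: normalize by the number of valid pairs found in the peptide.
    return [counts[p] / total for p in PAIRS]


def encode(peptide, max_k=2):
    """Concatenate the pair compositions for k = 0..max_k (hypothetical helper)."""
    features = []
    for k in range(max_k + 1):
        features.extend(k_spaced_pair_composition(peptide, k))
    return features
```

For the toy peptide "ACAC" with k = 0, the adjacent pairs are AC, CA, AC, so the AC entry is 2/3 and the CA entry is 1/3. With max_k = 2 the encoded vector has 3 × 400 = 1200 dimensions, which is why a feature selection step like the one the abstract mentions is usually needed before training.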
Keywords/Search Tags: Amidation, Feature extraction, Feature selection, Ensemble learning