
Systematic Analysis And Comparison Of Amino Acid Descriptors In Bioactive Peptide QSARs

Posted on: 2021-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: Q Q Miao
Full Text: PDF
GTID: 2370330611955143
Subject: Biophysics
Abstract/Summary:
Bioactive peptides play an important role in cell biology. Peptide quantitative structure-activity relationship (QSAR) analysis uses computer-aided drug design methods to study the quantitative relationship between the structure and function of biologically active peptides, and it is one of the hot spots in the development of new peptide drugs. Amino acid descriptors are widely used in peptide QSAR research. They take the basic "modules" of peptides and proteins, the amino acids, as the unit of analysis to characterize the sequence and structural features of the entire peptide. Machine learning and statistical modeling are then used to correlate these features with biological activity or efficacy, giving a quantitative functional relationship between the structure and activity of the peptide.

Our systematic analysis shows that researchers have proposed dozens of amino acid descriptors over the past few decades. As new descriptors continue to be proposed, researchers face considerable confusion and arbitrariness in selecting and using them. In view of this, we collected and sorted the existing mainstream amino acid descriptors in a comprehensive and systematic manner, and applied them to the QSAR analysis and comparison of various biologically active peptides. The main work is as follows.

From previous reports, 33 kinds of amino acid descriptors were collected. They can be roughly divided into physical and chemical properties, topological properties, quantitative properties, and comprehensive properties. These descriptors were applied to systematic QSAR studies of 5 classic biologically active peptide sets. In this process, we adopted 2 linear machine learning methods (MLR and PLS) and 4 nonlinear machine learning methods (SVM, LSSVM, RF, and GP) for statistical modeling, and constructed a total of 990 QSAR models (33 descriptors × 5 active peptide sets × 6 machine learning methods). We also carried out internal cross-validation, external blind validation, and strict Monte Carlo cross-validation (MCCV) to provide an in-depth analysis of the statistical performance of the models. We then conducted a systematic comparative study of the resulting statistics (R², RMSEE, R²cv, RMSCV, R²pred, Q²est, RMSEP). The results show that BTD-PLS-V, BTD-SVM-GH-scales, BTD-GP-VHSE, BTD-GP-PCPS, BTD-LSSVM-V, BTD-LSSVM-G-scales, BTD-LSSVM-FASGAI, BTD-LSSVM-VHSE, BTD-LSSVM-ISA-ECI, BTD-PLS-SSIA-AM1, and BTD-LSSVM-SSIA-PM3 model better than the other combinations.

We further used principal component analysis (PCA) to compress and extract the large number of original (first-level) amino acid descriptors collected, and obtained a new comprehensive (second-level) amino acid descriptor, which we call VGSV. We also used it to model the QSARs of the 5 biologically active peptide data sets and compared the results with those of the primary amino acid descriptors above.

In summary, we collected and organized a large number of previously published amino acid descriptors under a unified framework, which can serve as a basic amino acid descriptor database. By applying these descriptors to a series of classic bioactive peptide data sets with mainstream machine learning methods, we carried out systematic quantitative structure-activity comparison studies. The knowledge gained helps in seeking rules and suitable conditions for optimizing combinations of descriptor, peptide type, and machine learning method, and provides a useful standard reference for future amino acid descriptor development and peptide QSAR research. In addition, we proposed a new secondary descriptor that covers almost all the information of the traditional amino acid descriptors and performs well in the subsequent tests. It can therefore be regarded as a universal standard descriptor for biologically active peptides and proteins with diverse functions, as well as for related drug design and bioinformatics research.
Keywords/Search Tags:Amino acid descriptors, biologically active peptides, statistical modeling, machine learning, comparative studies, secondary descriptors, quantitative structure-activity relationships
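
As an illustration of two ideas in the abstract, the minimal Python sketch below shows how a peptide can be encoded residue by residue with an amino acid descriptor, and how the 33 × 5 × 6 = 990 model combinations can be enumerated. It is not the thesis's implementation: the descriptor values, the names `TOY_DESCRIPTOR`, `encode_peptide`, and `model_grid`, and the placeholder descriptor and peptide-set labels are all assumptions made for this example. Only the six method abbreviations and the counts come from the abstract.

```python
from itertools import product

# Hypothetical toy descriptor: two made-up numbers per residue.
# These are NOT values from any published amino acid scale.
TOY_DESCRIPTOR = {
    "A": (0.1, -0.2),
    "G": (0.0, 0.3),
    "V": (0.5, -0.1),
    "W": (-0.4, 0.9),
}


def encode_peptide(sequence, descriptor):
    """Concatenate per-residue descriptor vectors into one feature vector."""
    features = []
    for residue in sequence:
        features.extend(descriptor[residue])
    return features


def model_grid(descriptors, peptide_sets, methods):
    """List every (descriptor, peptide set, method) combination."""
    return list(product(descriptors, peptide_sets, methods))


# Method abbreviations as listed in the abstract.
METHODS = ["MLR", "PLS", "SVM", "LSSVM", "RF", "GP"]
# Placeholder labels; the abstract gives only the counts (33 and 5).
DESCRIPTOR_NAMES = [f"descriptor_{i}" for i in range(1, 34)]
PEPTIDE_SETS = [f"peptide_set_{i}" for i in range(1, 6)]

grid = model_grid(DESCRIPTOR_NAMES, PEPTIDE_SETS, METHODS)
dipeptide_features = encode_peptide("VW", TOY_DESCRIPTOR)
```

With this encoding, a peptide of length n described by a k-component descriptor becomes a vector of n × k features, which is the kind of input a regression method such as PLS or SVM would then be fitted to.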