
Study On Feature Expression And Fusion Algorithm In Multifunctional Enzyme Classification And Prediction

Posted on: 2020-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: G Liu
Full Text: PDF
GTID: 2370330575989045
Subject: Computer technology
Abstract/Summary:
Multifunctional enzymes play an important role as biological catalysts in many reactions within an organism, and correctly identifying them is valuable for genetic engineering and cell engineering. The purpose of this thesis is to predict the functions of multifunctional enzymes with a multi-label classifier. Because prediction performance depends on the data set, the feature expression, and the choice of classifier, this thesis studies each of these three aspects. The specific research work is as follows:

(1) In previous studies of multifunctional enzymes, the constructed data sets have homology that is too high to support the prediction of low-homology multifunctional enzymes. This thesis therefore constructs a low-homology multifunctional enzyme data set for prediction.

(2) For feature expression, a multi-evolution-information PSSM matrix is proposed, which expresses more evolutionary information about a multifunctional enzyme sequence than the conventional PSSM matrix. Local feature extraction based on the two-dimensional Gabor transform is also presented: the Gabor transform decomposes the PSSM matrix at multiple scales and in multiple directions, so that more information can be obtained. Compared with the existing PSSM-based feature extraction method (DPC-PSSM), this method shows some advantage in classification performance. For feature expression based on the amino acid sequence, a new feature extraction method based on dipeptide local words is proposed, which performs better than AAC and AmPseAAC.

(3) The multiple kinds of feature information obtained in the experiments are fused. First, a feature-extraction fusion method is used to fuse the features, and recursive feature elimination (RFE) is used for feature selection. Second, the fused data are normalized and redundancy is removed. After this processing, the fused features reach 92.21%, 93.73%, 91.11%, and 97.98% in recall, precision, F-value, and average precision, respectively.

(4) For the multifunctional enzyme classification and prediction problem, a random k-label ensemble classification algorithm is used, and the selection of its base classifier is discussed in detail. Four base classifiers are used in the experiments: support vector machine (SVM), k-nearest neighbor (KNN), naive Bayes (NB), and random forest (RF). Five-fold cross-validation and an analysis of the four evaluation metrics show that random forest performs best as the base classifier. Compared with other multi-label classification models, the model constructed in this thesis achieves good classification performance.
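
The recall, precision, and F-value reported above are multi-label metrics, and the abstract does not state which definitions were used. As a minimal sketch, assuming the common example-based definitions (each metric is computed per enzyme from its true and predicted label sets, then averaged), they could be computed as follows. The function name `example_based_scores` and the EC-class labels are hypothetical and purely illustrative. Average precision is omitted because it needs ranked label scores rather than label sets.

```python
from typing import Sequence, Set, Tuple


def example_based_scores(
    true_sets: Sequence[Set[str]], pred_sets: Sequence[Set[str]]
) -> Tuple[float, float, float]:
    """Return (recall, precision, F-value), each averaged over samples.

    Assumption: example-based definitions; the thesis may define
    these metrics differently.
    """
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("need two non-empty sequences of equal length")

    recalls, precisions, f_values = [], [], []
    for truth, pred in zip(true_sets, pred_sets):
        hits = len(truth & pred)  # labels predicted correctly
        recalls.append(hits / len(truth) if truth else 0.0)
        precisions.append(hits / len(pred) if pred else 0.0)
        total = len(truth) + len(pred)
        f_values.append(2 * hits / total if total else 0.0)

    n = len(true_sets)
    return sum(recalls) / n, sum(precisions) / n, sum(f_values) / n


# Hypothetical label sets for three enzymes (not data from the thesis).
y_true = [{"EC1", "EC2"}, {"EC3"}, {"EC2", "EC4"}]
y_pred = [{"EC1", "EC2"}, {"EC3", "EC5"}, {"EC2"}]

recall, precision, f_value = example_based_scores(y_true, y_pred)
print(f"recall={recall:.4f} precision={precision:.4f} F={f_value:.4f}")
```

On this toy input the averaged F-value (about 0.78) is lower than both the averaged recall and the averaged precision (about 0.83 each). Under per-sample averaging this is possible, which may be why an F-value can fall below both of the other two figures.
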
Keywords/Search Tags: multi-label learning, multifunction enzyme, PSSM matrix, multiple evolution matrix, feature fusion