
Research On Protein Subcellular Localization Prediction Based On Evolutionary Information And Feature Fusion

Posted on: 2022-05-28 | Degree: Master | Type: Thesis
Country: China | Candidate: L Du | Full Text: PDF
GTID: 2480306347473094 | Subject: Signal and Information Processing
Abstract/Summary:
The development of sequencing technology has provided abundant raw material for the exploration of life, but it has also brought a bigger challenge: traditional experimental methods are increasingly inadequate in the face of such a huge amount of biological data. The rapid development of bioinformatics theory provides support for biological data mining. By combining biological theory with computer science, a large amount of information has been extracted from biological data, and research on the subcellular localization of proteins is an important direction within this work. Based on bioinformatics theory, this paper studies protein subcellular localization prediction, centered on evolutionary information and feature fusion. The main work is summarized as follows:

(1) Based on protein evolutionary information and LDA, a new method for predicting the subcellular location of apoptosis proteins is proposed. First, two evolutionary-information features are derived from the position-specific scoring matrix (PSSM) of apoptosis proteins: CTM and AECA-PSSM. CTM is extracted from the consensus sequence obtained from the PSSM, and combines the evolutionary information with the transfer distribution information of amino acids. AECA-PSSM uses an absolute entropy correlation analysis method to directly extract the correlation information between any two amino acid properties in the PSSM. After the two features are fused, linear discriminant analysis (LDA) is applied to eliminate redundancy and noise in the fused features. Finally, an SVM is used to predict the subcellular location of the proteins. The proposed method is validated on two classical apoptosis protein datasets, with classification accuracies of 99.7% and 95.6% on the CL317 and ZW225 datasets, respectively.

(2) A multi-site protein localization method based on evolutionary information and MLDA is proposed. First, two novel features based on evolutionary information are fused: CED and PSSM-DWT. The CED feature introduces a distance parameter to capture the association between the amino acid residue at each position in the consensus sequence and the residue a given distance after it, so that it reflects both the local and the global sequence information of the protein. The PSSM-DWT feature treats the PSSM as a non-stationary signal and extracts its time-frequency distribution information by means of the discrete wavelet transform. Then, multi-label linear discriminant analysis (MLDA) is used to address the noise problem in multi-label learning. Finally, the dimension-reduced features are fed into the ML-RBF classifier for prediction. The experimental results show that, compared with other methods based on GO features, the evolutionary-information features used in this paper also achieve good localization prediction results.

(3) A deep fusion evolutionary information model is proposed to predict protein subcellular locations. The model is built on the different evolutionary features of proteins extracted by the methods proposed in this paper. Three parallel multi-scale dilated convolution modules extract deeper representations of the different feature views, and during feature fusion an attention mechanism captures the correlation between the different features and the key information that contributes to subcellular localization. Results on Gram-positive and Gram-negative bacterial protein datasets show that the proposed method can automatically learn and extract more effective feature information and improve the subcellular location recognition rate.
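
To make two of the building blocks above more concrete, the following is a minimal sketch of how a consensus sequence and wavelet-based features might be derived from a PSSM. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not say which wavelet, how many decomposition levels, or which summary statistics are used, so the Haar wavelet, the two levels, and the mean and standard deviation here are assumptions, as are the function names `consensus_sequence`, `haar_dwt`, and `pssm_dwt_features`. The PSSM is assumed to be a list of rows (one per sequence position) with 20 scores each, in the usual PSI-BLAST column order.

```python
from statistics import mean, pstdev

# Assumed column order of the PSSM (the usual PSI-BLAST ordering).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def consensus_sequence(pssm):
    """For each position (row), pick the amino acid with the highest score."""
    return "".join(
        AMINO_ACIDS[max(range(len(row)), key=row.__getitem__)] for row in pssm
    )


def haar_dwt(signal):
    """One level of the Haar DWT; returns (approximation, detail) coefficients.

    An odd-length signal is padded by repeating its last sample (an assumption).
    """
    values = list(signal)
    if len(values) % 2:
        values.append(values[-1])
    scale = 2 ** 0.5
    approx = [(values[i] + values[i + 1]) / scale for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / scale for i in range(0, len(values), 2)]
    return approx, detail


def pssm_dwt_features(pssm, levels=2):
    """Treat each PSSM column as a signal and summarise its wavelet coefficients.

    For every column, the mean and standard deviation of the detail coefficients
    at each level, plus those of the final approximation, are collected.
    """
    features = []
    for column in zip(*pssm):
        approx = list(column)
        for _ in range(levels):
            approx, detail = haar_dwt(approx)
            features += [mean(detail), pstdev(detail)]
        features += [mean(approx), pstdev(approx)]
    return features
```

With this sketch, a protein of any length maps to a fixed-length vector of 20 × (2 × levels + 2) values, which could then be fused with other features and passed to a dimensionality reduction step and a classifier, as the abstract describes.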
Keywords/Search Tags: subcellular localization, evolutionary information, feature extraction, feature fusion, dimensionality reduction algorithms, deep learning