
Research On Prediction Method Of Protein Tertiary Structure Based On Recurrence Quantitative Analysis And Horizontal Visibility Graph

Posted on: 2022-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: H Jiang
GTID: 2480306347973089
Subject: Signal and Information Processing
Abstract/Summary:
Proteins are essential components of living organisms, and the study of protein structure contributes to the treatment of major diseases. The number of proteins with unknown structure is increasing rapidly, while the number of proteins with known structure is increasing slowly. Research on protein tertiary structure prediction methods therefore has both theoretical significance and practical application prospects.

Biological signal analysis and processing is an important research field in modern digital signal processing. Owing to the complexity of living systems, biological signals are usually nonlinear, chaotic, and nonstationary. Transforming a time series into a complex network offers a new approach to nonlinear time series analysis, and network-based methods of this kind have been widely used in many research fields.

This thesis studies a protein tertiary structure prediction method based on recurrence quantitative analysis and the horizontal visibility graph. The protein sequence is represented as a recurrence plot, and a feature extraction method for protein tertiary structure based on recurrence quantitative analysis is studied. The limited penetrable weighted visibility graph method is also studied: the protein sequence is constructed as a complex network, and a feature extraction method based on the topological statistical characteristics of that network is proposed. Combined with a classification model, these features improve the prediction accuracy of protein tertiary structure. The main methods and innovations are as follows.

First, the transformation from protein sequence to time series is improved using multiscale time series analysis. The amino acid sequence of a protein is transformed by chaos game representation into two time series, which carry hidden topological information about the protein. Coarse-graining these time series, following the multiscale approach, enriches the topological information and strengthens the recurrence of the time series.

Second, a prediction method based on multiscale recurrence quantitative analysis and the horizontal visibility graph is studied. Recurrence quantitative analysis reflects the recurrence of a time series, while horizontal visibility graph feature extraction reflects its topological characteristics; one represents local information and the other global information. This thesis combines the two feature extraction methods and proposes two new feature extraction methods, Laplacian energy and Laplacian energy, to describe the local features and global information of the protein sequence. The prediction accuracies of the improved method on the 25 PDB and 1189 protein datasets are 95.33% and 93%, respectively. The experimental results show that combining recurrence quantitative analysis with horizontal visibility graph feature extraction improves protein structure prediction.

Third, a prediction method based on the limited penetrable weighted horizontal visibility graph is studied. A traditional horizontal visibility graph consists of nodes and edges, but protein sequences differ in length. For short protein sequences, the horizontal visibility graph tends to have too few edges, which makes it impossible to extract effective topological information. The traditional horizontal visibility graph is also a binary network, with no weight information on its edges. To address these limitations, and drawing on the characteristics of the limited penetrable horizontal visibility graph and the weighted horizontal visibility graph, a complex network construction method based on the limited penetrable weighted horizontal visibility graph is proposed. A new feature extraction method for the tertiary structure of protein sequences, based on two features, clustering coefficient entropy and average node strength, is also proposed. The extracted features are fed into a support vector machine classifier, and the resulting prediction accuracies reach 97.67% and 91.99% on the 25 PDB and 1189 datasets, respectively. The experimental results show that the improvements to the horizontal visibility graph made in this thesis can greatly improve the prediction of protein structure.
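
As a rough illustration of the building blocks named in the abstract, the Python sketch below shows multiscale coarse-graining, a horizontal visibility graph with an optional penetrable distance, and a recurrence rate. It is not the thesis's implementation: the function names, the `penetrable` and `eps` parameters, the use of non-overlapping window means for coarse-graining, and the recurrence rate computed without phase-space embedding are all assumptions made for the example. Edge weights are omitted because the abstract does not define them.

```python
def coarse_grain(series, scale):
    """Average non-overlapping windows of length `scale` (assumed scheme)."""
    n = len(series) // scale
    return [sum(series[m * scale:(m + 1) * scale]) / scale for m in range(n)]


def horizontal_visibility_edges(series, penetrable=0):
    """Return edges (i, j), i < j, of a horizontal visibility graph.

    Points i and j are linked when at most `penetrable` intermediate
    points reach the lower of the two heights. penetrable=0 gives the
    standard graph; larger values give a limited penetrable variant.
    """
    n = len(series)
    edges = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            floor = min(series[i], series[j])
            blockers = sum(1 for k in range(i + 1, j) if series[k] >= floor)
            if blockers <= penetrable:
                edges.append((i, j))
    return edges


def recurrence_rate(series, eps):
    """Fraction of point pairs (i, j) with |x_i - x_j| <= eps.

    Assumption: no embedding is applied; the scalar values are compared
    directly, and the diagonal i == j is included.
    """
    n = len(series)
    hits = sum(1 for a in series for b in series if abs(a - b) <= eps)
    return hits / (n * n)


if __name__ == "__main__":
    x = [3, 1, 2, 4]
    print(horizontal_visibility_edges(x))                # standard graph
    print(horizontal_visibility_edges(x, penetrable=1))  # denser variant
    print(coarse_grain([1, 2, 3, 4, 5], 2))
    print(recurrence_rate([0, 1, 0], 0.5))
```

Allowing one penetrable point adds the edge between the second and fourth samples in the small example, which illustrates how the limited penetrable variant yields denser graphs for short sequences. The quadratic pair loop is chosen for readability, not speed.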
Keywords/Search Tags:Prediction of protein tertiary structure, Limited penetrable horizontal visibility graph, Weighted horizontal visibility graph, Recurrence quantification analysis