
Protein Remote Homology Detection And Folding Recognition Based On SCOP Topology

Posted on: 2021-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zhu
GTID: 2370330611498852
Subject: Computer Science and Technology
Abstract/Summary:
Protein fold recognition and remote homology detection are two basic problems in bioinformatics. Accurately predicting the remote homologues and fold classes of proteins from their sequence information plays an important role in the study of protein structure and function and in drug design. In this study, we treat protein sequence detection as a retrieval task: the aim is to find proteins of known structure and function that are highly related to the query protein, and to infer the query's structure and function from them.

Traditional methods of protein remote homology detection often perform only moderately on protein sequences with low sequence similarity. Some machine learning methods address this problem, but their performance depends highly on the quality of the features. Although protein similarity networks can further improve the results, they depend heavily on the performance of the basic ranking method. To solve these problems, this study improves the basic ranking results by incorporating all features into a Learning to Rank method, and then combines the ranking results with protein similarity networks based on SCOP topology. The resulting predictor is called ProtDec-LTR4.0. Rigorous tests on two widely used benchmark datasets showed that the strategy effectively improves the performance of protein remote homology detection, and that its predictive performance is better than that of any existing method.

Although the ranking fusion strategy has been successful in protein remote homology detection, in fold recognition the low sequence similarity means that the basic ranking methods achieve only moderate coverage of positive samples, which can leave features missing. To solve this problem, this study builds protein feature vectors and fills in the missing features based on SCOP topology relationships, and builds a global-feature support vector machine using the Learning to Rank method. The resulting predictor is called Fold-LTR-SVM. Tests on two widely used benchmark datasets showed that Fold-LTR-SVM performs better than the existing methods.

Because protein sequences in fold recognition have low similarity, the performance of current prediction methods is generally low, and existing protein similarity network methods depend heavily on the performance of the basic ranking methods. In this study, we propose a new method for constructing a protein similarity network based on SCOP topology and the triadic closure principle. It replaces the traditional query-feedback protein pair similarity with a global sequence similarity. To further improve performance, we use a Learning to Rank method to fuse a large number of alignment score features, and then combine them with the constructed protein similarity network. The resulting predictor is named LTR-TCP-FR. Experiments show that LTR-TCP-FR performs better than any existing fold recognition method.
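To give a rough feel for how a closure principle over a protein similarity network can change a ranking, here is a minimal Python sketch. It is not the thesis's algorithm: the function names (`close_triads`, `rank`), the multiplicative propagation rule, and the toy scores are all assumptions made for illustration. The idea it shows is only that, if the query scores well against template `a`, and `a` is strongly linked to template `b` (for example because both share a SCOP fold), then the query-to-`b` score can be raised.

```python
from typing import Dict, List, Tuple


def close_triads(query_scores: Dict[str, float],
                 template_sim: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Hypothetical re-scoring step (an assumption, not the thesis's rule).

    For each template edge (a, b), the query's score for b is raised to
    score(query, a) * sim(a, b) when that product is larger than the
    current score, and likewise in the other direction.
    """
    updated = dict(query_scores)
    for (a, b), sim in template_sim.items():
        # Template-template edges are treated as undirected.
        for src, dst in ((a, b), (b, a)):
            via = query_scores.get(src, 0.0) * sim
            if via > updated.get(dst, 0.0):
                updated[dst] = via
    return updated


def rank(scores: Dict[str, float]) -> List[str]:
    """Sort template ids by descending score, breaking ties by id."""
    return sorted(scores, key=lambda p: (-scores[p], p))


# Toy data, invented for this example: base ranking scores of three
# templates against one query, and one template-template link.
query_scores = {"t1": 0.9, "t2": 0.2, "t3": 0.4}
template_sim = {("t1", "t2"): 0.8}

baseline = rank(query_scores)
reranked = rank(close_triads(query_scores, template_sim))
```

In this toy case the weakly scored template `t2` moves ahead of `t3` because it is closely linked to the top hit `t1`. A real system would take the base scores from a trained ranking model and the network edges from SCOP annotations and sequence comparisons.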
Keywords/Search Tags:protein remote homology detection, fold recognition, topological structure, learning to rank, protein similarity network