
Protein Function Prediction And Refinement Based On Manifold Learning

Posted on: 2018-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: H D Liang
Full Text: PDF
GTID: 2310330515483731
Subject: Computer application technology
Abstract/Summary:
With the development of novel high-throughput experimental techniques, enormous amounts of protein data have been collected in the post-genomic era. However, the gap between the available protein data and their functional annotations keeps widening. Even for a well-studied organism such as yeast, a quarter of its protein functions remain uncertain. It is therefore highly desirable to develop efficient computational methods for the automatic annotation of protein functions, which is one of the most challenging problems in bioinformatics. In addition, current protein function annotation data, whether obtained by high-throughput experiments or by computational methods, contain many false positives and false negatives, and this noise can degrade applications that rely on the annotations. In this thesis, we propose three computational methods for protein function prediction and for reducing the noise in protein functional data. The methods are based on the topology of the protein interaction network, manifold learning, and graph theory. The main contributions of this thesis are as follows.

(1) For the automatic annotation of protein functions, a novel algorithm based on manifold learning and multi-label learning is proposed. First, the protein-protein interaction network is weighted by edge-betweenness. The weighted network is then embedded into a low-dimensional representation space by the isometric feature mapping (ISOMAP) algorithm to obtain low-dimensional features of the proteins. Finally, protein function prediction is cast as a classic multi-label learning problem, so that various multi-label learning methods can be used for prediction and evaluation. The experimental results show that the proposed method produces more reasonable low-dimensional protein features and higher prediction accuracy than alternative methods.

(2) A robust algorithm based on multi-label linear regression with function correlation is proposed. First, the protein-protein interaction network weighted by edge-betweenness is embedded into a low-dimensional subspace using the ISOMAP algorithm. Then, according to the distribution of the low-dimensional protein features, classic linear regression is extended to the multi-label setting, and the cosine similarities between protein functions are computed and integrated into the objective function of the multi-label linear regression model as a regularization term. Finally, the effectiveness of the proposed method is evaluated on a yeast database. The experimental results demonstrate that the proposed method achieves better prediction performance than other state-of-the-art methods.

(3) To address the noise in protein functional annotation data, a novel algorithm based on graph-regularized l1-norm principal component analysis (Gl1PCA) is proposed for protein functional refinement. First, a protein graph and a function graph are constructed from the protein-protein interaction network and the similarity matrix of protein functions, respectively. Then, the Laplacians of the protein and function graphs are integrated into the objective function of the l1PCA algorithm as regularization terms. Finally, a fast solver for the refinement model is derived based on the Augmented Lagrange Multiplier (ALM) method. The proposed method is supported by theoretical proof and by experiments. The experimental results indicate that the proposed method can effectively refine protein functional annotation data.
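The embedding step shared by contributions (1) and (2) can be illustrated with a minimal sketch. This is not the thesis implementation. It assumes the edge weights are already given as positive edge lengths in a symmetric matrix (the abstract does not say how edge-betweenness values are turned into lengths), and it uses a textbook form of ISOMAP on a graph: geodesic distances by Floyd-Warshall followed by classical multidimensional scaling. The function names `geodesic_distances` and `isomap_embedding` and the toy network are hypothetical.

```python
import numpy as np


def geodesic_distances(weights):
    """All-pairs shortest-path lengths via Floyd-Warshall.

    weights[i, j] > 0 is the (assumed) length of edge i-j; 0 means no edge.
    """
    n = weights.shape[0]
    dist = np.where(weights > 0, weights, np.inf).astype(float)
    np.fill_diagonal(dist, 0.0)
    for k in range(n):
        # Allow paths that pass through intermediate node k.
        dist = np.minimum(dist, dist[:, [k]] + dist[[k], :])
    return dist


def isomap_embedding(weights, n_components=2):
    """Embed graph nodes with classical MDS on geodesic distances."""
    dist = geodesic_distances(weights)
    if np.isinf(dist).any():
        raise ValueError("the graph must be connected")
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip small negatives caused by
    # non-Euclidean geodesics or round-off.
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy network (hypothetical): four proteins connected in a chain.
toy_weights = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)
toy_features = isomap_embedding(toy_weights, n_components=1)
```

Each row of `toy_features` is the low-dimensional feature vector of one protein; in the setting described above, these rows would serve as the inputs to a multi-label learner.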
Keywords/Search Tags: Protein-Protein Interaction Network, Manifold Learning, Multi-Label Learning, Graph Regularization, Functional Refinement