Protein Classification Based On Neural Network

Posted on: 2020-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: G Qiu
Full Text: PDF
GTID: 2370330623956706
Subject: Computer Science and Technology
Abstract/Summary:
Proteins are essential to life and perform a wide range of functions that sustain it, which makes proteomics an important research field in modern bioinformatics. Proteins can be grouped into classes according to their functions, and proteins of the same class have similar structures and properties, so studying protein classification is an important way to determine protein function. With the development of biotechnology, a large number of proteins have been discovered, but only a small fraction of them have been analyzed experimentally to determine their structure and corresponding biological functions. For rapidly growing protein data, experimental methods require a great deal of labor and time. It is therefore increasingly important to classify proteins and study their functions with computational techniques, in order to better understand the principles behind life processes.

Today, machine learning and neural network techniques are widely applied to bioinformatics problems. They use learning methods to extract knowledge from large amounts of data and to analyze the regularities behind it. In many of these problems, the data can be represented naturally by discrete structures such as graphs, networks, trees, or sequences.

This thesis takes proteins as the research object. Each protein is transformed into a graph model, features of the protein graph are extracted with the proposed VES (Vertex Edge Similarity) graph kernel function, and the kernel is combined with a DNN (Deep Neural Network) to build the VES-DNN protein classification model. The experimental results show that the VES-DNN model classifies proteins better than other graph kernels. Building on this, the thesis applies ensemble learning over multiple kernels and proposes the MultiKernel-Stacking (Multiple Kernel Stacking) protein classification model. The experimental results show that this model outperforms the VES-DNN model.

The main research contents of this thesis are as follows:

1. The VES graph kernel function is proposed. Each row of a graph's weighted adjacency matrix is used as the vector of the corresponding vertex. The similarity of two graphs is measured by comparing the vertex vectors of the two graphs, and the kernel value is determined from the maximum similarity between the vertices of the two graphs.

2. A VES-DNN protein classification model based on the VES graph kernel function is proposed. The VES graph kernel function is used to compute the kernel matrix of the protein graph samples, and each row of the kernel matrix is used as the input feature vector of the neural network, which produces the classification result. The experimental results show that the model effectively improves protein classification.

3. The MultiKernel-Stacking protein classification model is proposed. The model uses the Stacking ensemble learning method: the classification results produced by the VES-DNN model under multiple graph kernel functions are assembled into a vector, which is used as the input of a neural network that produces the classification result of the MultiKernel-Stacking model. Analysis of the experimental results and comparison with the VES-DNN model show that this model further improves protein classification.
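The first two contributions can be illustrated with a minimal sketch. It is not the thesis's VES kernel, whose exact formula the abstract does not give. The sketch assumes that vertex vectors are zero-padded to a common length, that vertex similarity is cosine similarity, and that the kernel value is the symmetrized mean of each vertex's best match in the other graph. The names `ves_like_kernel` and `kernel_matrix` are hypothetical.

```python
import math


def pad(row, n):
    """Zero-pad a vertex vector to length n (assumption: graphs may differ in size)."""
    return list(row) + [0.0] * (n - len(row))


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ves_like_kernel(adj_a, adj_b):
    """Hypothetical VES-style kernel between two non-empty weighted adjacency matrices.

    Each row is the vector of one vertex. For every vertex, take its maximum
    similarity to any vertex of the other graph, average these maxima, and
    symmetrize over both directions (all assumptions, see lead-in).
    """
    n = max(len(adj_a), len(adj_b))
    rows_a = [pad(r, n) for r in adj_a]
    rows_b = [pad(r, n) for r in adj_b]

    def one_way(xs, ys):
        return sum(max(cosine(x, y) for y in ys) for x in xs) / len(xs)

    return 0.5 * (one_way(rows_a, rows_b) + one_way(rows_b, rows_a))


def kernel_matrix(graphs):
    """Kernel matrix over all graph pairs; row i is the feature vector of sample i."""
    return [[ves_like_kernel(g, h) for h in graphs] for g in graphs]


# Two toy graphs given as weighted adjacency matrices.
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
path = [[0, 2], [2, 0]]
K = kernel_matrix([triangle, path])
```

Under these assumptions, row `K[i]` is the feature vector that would be fed to the neural network for sample `i`, mirroring how the abstract describes the VES-DNN input.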
Keywords/Search Tags: protein classification, graph kernel, neural network, ensemble learning