Protein Tertiary Structure Prediction Based On Flexible Neural Tree And Its Integration

Posted on: 2012-08-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Huang
Full Text: PDF
GTID: 2120330335979740
Subject: Computer application technology
Abstract/Summary:
With the emergence of structural genomics, predicting biological function from structure has become one of the main goals of structural biology and bioinformatics. Protein function is determined largely by tertiary structure. Studying protein structure therefore helps explain the roles proteins play, how they carry out their biological functions, and how they interact with one another, and it is important for biology, medicine and pharmacy. Understanding the tertiary structure of a protein is thus a prerequisite for understanding its function.

This thesis introduces protein tertiary structure, protein representation, the flexible neural tree, and the basic theory of integrated (ensemble) learning. Building on previous research, it uses the flexible neural tree and its integration to predict protein tertiary structure. The multi expression programming algorithm is used to optimize the structure of the flexible neural tree, and the particle swarm optimization algorithm is used to optimize the parameters of the model. Ensemble learning is based on error-correcting output codes; the thesis details their basic principles and the methods used to decide the final result. This approach converts a multi-class problem into a set of two-class problems in order to achieve better prediction results.

Predicting protein tertiary structure with the flexible neural tree is divided into three stages: extracting protein features, building the prediction model, and ensemble learning.

(1) Protein feature extraction. So that the data can be handled by a computer, features must first be extracted from the protein; that is, the amino acid sequence is converted into a vector in the input space. This is also known as the encoding process.
Feature selection is very important for prediction. Commonly used protein features include amino acid composition (AA), peptide composition, pseudo-amino acid composition (PseAA), and hydrophobic patterns. The thesis focuses on PseAA as the input feature and combines it with other features. Experiments show that combining pseudo-amino acid composition with other features achieves good prediction accuracy.

(2) Building the prediction model. The flexible neural tree model overcomes drawbacks of other nonlinear models, which are slow and whose network structure is difficult to adjust. It has the following advantages: the inputs, outputs and network structure need not be designed in advance, because the model can automatically design and optimize its network structure and parameters; the connections between layers need not be complete, and cross-layer connections are allowed; and the evolved flexible neural tree usually has a simple structure and generalizes better than a general neural network. This thesis uses the flexible neural tree as the prediction model, with the multi expression programming algorithm optimizing the tree structure and the particle swarm optimization algorithm optimizing the model parameters.

(3) Ensemble learning. To further improve classification performance, the classifiers are finally combined into an ensemble. Tests on the C204 and 640 datasets show that ensemble learning greatly improves the final prediction accuracy.
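As an illustration of the encoding process in stage (1), the sketch below computes plain amino acid composition (AA), the simplest of the features listed above. It is not the thesis's feature extractor: the function name `amino_acid_composition` and the choice to skip non-standard residue letters are assumptions made for this example, and PseAA would extend such a vector with additional sequence-order terms that are not shown here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (fixed order = fixed vector layout).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Encode a protein sequence as a 20-dimensional frequency vector.

    Hypothetical helper for illustration: letters outside the 20 standard
    residues (e.g. 'X') are skipped, which is an assumption of this sketch.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One entry per residue type, in the order given by AMINO_ACIDS.
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    vector = amino_acid_composition("ACCA")
    print(len(vector), vector[:2])
```

Every sequence, whatever its length, maps to a vector of the same dimension, which is what allows it to serve as input to a fixed-input prediction model.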
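The way error-correcting output codes turn a multi-class problem into several two-class problems can be sketched as follows. This is a minimal example under stated assumptions, not the thesis's configuration: the four class labels, the 7-bit code matrix `CODE_MATRIX`, and the use of Hamming distance as the decision rule are all chosen here for illustration. Each column of the matrix defines one binary classifier; at prediction time the bits produced by the binary classifiers are compared with each class's codeword.

```python
# Hypothetical code matrix: 4 classes, 7 binary classifiers (one per column).
# Any two codewords differ in 4 positions, so one wrong bit can be corrected.
CODE_MATRIX = {
    "class_1": (1, 1, 1, 1, 1, 1, 1),
    "class_2": (0, 0, 0, 0, 1, 1, 1),
    "class_3": (0, 0, 1, 1, 0, 0, 1),
    "class_4": (0, 1, 0, 1, 0, 1, 0),
}


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(x != y for x, y in zip(a, b))


def decode(bits, code_matrix=CODE_MATRIX):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(code_matrix, key=lambda label: hamming_distance(bits, code_matrix[label]))


if __name__ == "__main__":
    # The sixth binary classifier is wrong, yet the closest codeword is unchanged.
    print(decode((0, 0, 1, 1, 0, 1, 1)))
```

In a full system each bit would come from a trained binary classifier, for example one flexible neural tree per column; here the bits are supplied by hand to keep the sketch self-contained.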
Keywords/Search Tags:Protein Tertiary Structure, Feature Extraction, Pseudo-amino Acid Composition, Flexible Neural Tree, Integrated Learning