
The Study On Protein Classification Based On Pseudo Amino Acid

Posted on: 2012-01-24  Degree: Master  Type: Thesis
Country: China  Candidate: C C Xiao
GTID: 2120330335451564  Subject: Control theory and control engineering
Abstract/Summary:
Understanding the biological significance contained in biological data has become an extremely important topic in the post-genome era, and the role of bioinformatics will become increasingly important. Faced with huge amounts of protein sequence data, processing them with intelligent algorithms is important for studying protein structure and function. Because protein structure and function are highly complex, commonly used experimental methods have difficulty determining the three-dimensional structure of some proteins (such as proteins that are difficult to crystallize, or very large proteins). Experimental methods are also costly and very time-consuming, so in recent years researchers have paid more and more attention to designing algorithms for computer simulation. This thesis proposes a new visualization method for protein sequences, compares it with other methods on standard datasets, and verifies its validity. The main innovations of this thesis are summarized as follows:
(1) A new visualization method for protein sequences, the distance matrix image, is proposed. The hydrophobicity, hydrophilicity, and side-chain mass values of the amino acids are used as the spatial coordinates of the residues in a protein sequence. The distance between every pair of amino acids in the sequence is calculated from these coordinates, and the resulting distance matrix is treated as a texture image: each matrix element corresponds to one image pixel, and its value corresponds to the gray value. This distance matrix image can reflect the general characteristics of a protein sequence.
(2) A new kind of pseudo amino acid composition is constructed based on the geometric moments of the distance matrix image, which reflects the features of a protein sequence well.
(3) Several classification predictors for protein sequences are designed based on the proposed distance matrix image, including a human papillomavirus risk type predictor, a protein secondary structure type predictor, and a G protein-coupled receptor type predictor. The success rates of these predictors are higher than those of existing predictors.
(4) To address the shortcomings of the amino acid composition method, a decimal number coding model is constructed based on the amino acid digital coding model. An example of nuclear receptor classification shows that its results are better than those of the amino acid composition method.
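The construction in items (1) and (2) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not name the distance measure or the moment orders, so Euclidean distance and raw geometric moments m_pq = sum over pixels of x^p * y^q * I(x, y) are assumptions here. The function names and the TOY_PROPERTIES table are hypothetical, and the table holds placeholder numbers rather than real physicochemical values.

```python
import numpy as np

# Placeholder table: residue -> (hydrophobicity, hydrophilicity, side-chain mass).
# Illustrative numbers only, NOT real physicochemical values (assumption).
TOY_PROPERTIES = {
    "A": (0.0, 0.0, 0.0),
    "G": (3.0, 4.0, 0.0),
    "L": (0.0, 0.0, 12.0),
}


def distance_matrix(sequence, properties):
    """Pairwise distances between residues placed in 3-D property space.

    Euclidean distance is an assumption; the abstract only says "distance".
    """
    coords = np.array([properties[aa] for aa in sequence], dtype=float)  # shape (n, 3)
    diff = coords[:, None, :] - coords[None, :, :]                       # shape (n, n, 3)
    return np.sqrt((diff ** 2).sum(axis=-1))                             # shape (n, n)


def to_gray_image(dist):
    """Scale a distance matrix to 8-bit gray values (this scaling is an assumption)."""
    peak = dist.max()
    if peak == 0:
        return np.zeros(dist.shape, dtype=np.uint8)
    return np.round(255.0 * dist / peak).astype(np.uint8)


def geometric_moment(image, p, q):
    """Raw geometric moment m_pq, with x the column index and y the row index."""
    img = np.asarray(image, dtype=float)
    rows, cols = np.indices(img.shape)
    return float(((cols ** p) * (rows ** q) * img).sum())


dist = distance_matrix("AGL", TOY_PROPERTIES)
image = to_gray_image(dist)
# A few low-order moments could serve as extra components of a feature vector.
features = [geometric_moment(dist, p, q) for p, q in [(0, 0), (1, 0), (0, 1)]]
```

In a sketch like this, low-order moments would be appended to the 20 amino acid frequencies to form a pseudo amino acid composition vector. Which moments the thesis actually uses, and how they are normalized, is not stated in the abstract.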
Keywords/Search Tags:bioinformatics, protein classification, distance matrix image, feature extraction, pseudo amino acid composition, digital coding model