Predicting Protein Subcellular Location Based On Principal Component Analysis

Posted on: 2011-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: D Shi
Full Text: PDF
GTID: 2120330332460917
Subject: Detection technology and automation equipment
Abstract/Summary:
With the completion of human genome mapping, biology has moved from the genome era into the post-genome era, one of whose most remarkable features is the explosive growth in the volume of protein sequence data. The location of a protein in a cell is closely correlated with its biological function. Given the avalanche of protein sequences generated in the post-genomic age, automated methods for efficiently identifying the attributes of uncharacterized proteins are highly desirable. One key to this is an effective model for representing a protein sample, and various non-sequential, or discrete, models have been proposed for that purpose. The simplest discrete model is the amino acid composition, but representing a protein this way discards all sequence-order information. To address this dilemma, Kuo-Chen Chou introduced the concept of pseudo amino acid composition, whose essence is to keep using a discrete model to represent a protein without completely losing its sequence-order information. However, it is difficult to determine in advance the optimal value of λ, the parameter that reflects sequence-order effects. In view of this, a data analysis method, principal component analysis (PCA), is applied to deal with the problem. Experimental results show that our method provides superior performance in predicting protein subcellular location.
Keywords/Search Tags: Bioinformatics, Protein Subcellular Location, Pseudo Amino Acid Composition, Principal Component Analysis, Serial Correlation Factor
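
The representation described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes the commonly cited form of Chou's pseudo amino acid composition (20 composition terms followed by λ sequence-order correlation factors), a single physicochemical property (the Kyte-Doolittle hydropathy scale, chosen here only as a stand-in), and a weight factor w = 0.05. The abstract does not say exactly how PCA is applied; one plausible reading, assumed here, is to compute the feature vector with a generously large λ and let PCA compress it, so that λ need not be tuned by hand. The function names pseaac and pca_reduce are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used as a single stand-in property.
# Assumption: the abstract does not state which properties the thesis uses.
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
      "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
      "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
      "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}

# Standardize the property to zero mean and unit standard deviation
# over the 20 amino acids.
_vals = np.array([KD[a] for a in AMINO_ACIDS])
H = dict(zip(AMINO_ACIDS, (_vals - _vals.mean()) / _vals.std()))


def pseaac(seq, lam, w=0.05):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector.

    seq must contain only the 20 standard residues and be longer than lam.
    w is the weight of the sequence-order terms (0.05 is an assumed default).
    """
    if lam >= len(seq):
        raise ValueError("lam must be smaller than the sequence length")
    h = np.array([H[a] for a in seq])
    # Plain amino acid composition: normalized occurrence frequencies.
    freq = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float) / len(seq)
    # k-th tier correlation factor: mean squared property difference
    # between residues that are k positions apart.
    theta = np.array([np.mean((h[:-k] - h[k:]) ** 2) for k in range(1, lam + 1)])
    denom = freq.sum() + w * theta.sum()
    return np.concatenate([freq, w * theta]) / denom


def pca_reduce(X, n_components):
    """Project the rows of X onto their first n_components principal axes."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # principal component scores
```

For a set of sequences, stacking the pseaac vectors row by row and passing the matrix to pca_reduce yields a lower-dimensional feature matrix that could then be fed to a classifier. The choice of classifier and of the number of retained components is not specified in the abstract.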