Predicting Protein Subchloroplast Locations Based On Fusion Features

Posted on: 2015-03-22  Degree: Master  Type: Thesis
Country: China  Candidate: X Y Xiang  Full Text: PDF
GTID: 2180330467962689  Subject: Physics
Abstract/Summary:
Chloroplasts are organelles found in the cells of green plants and eukaryotic algae. They play important functional roles in many biological processes, such as photosynthesis and cellular metabolism. Subchloroplast location is a finer-grained problem than subcellular location: knowing a protein’s subchloroplast location provides in-depth insight into the protein’s function and the microenvironment in which it interacts with other molecules. In this study, based on a previously constructed chloroplast database, an approach for predicting protein subchloroplast locations is proposed that combines amino acid composition, average chemical shift, gene ontology annotation, evolutionary information, and tripeptides. The tripeptide feature set that gave the best accuracy was selected using the binomial distribution. Using a support vector machine, the overall prediction accuracy for subchloroplast location is 90.2% in the jackknife test. Good prediction results were also obtained in the independent test and in cross-validation. Compared with other methods, our method showed excellent predictive performance.

In this paper, datasets of A. thaliana subchloroplast localization proteins were also constructed. Gene ontology annotation and evolutionary information were chosen as the features to represent A. thaliana chloroplast proteins. An ensemble classifier that combines the KNN and SVM algorithms through a voting system was used to predict the subchloroplast localization of A. thaliana proteins. The overall prediction accuracy in the jackknife test is 79.7% for this dataset.
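The abstract does not spell out how tripeptides were ranked with the binomial distribution, so the following is only a minimal sketch of one common formulation, not the thesis's actual procedure. It assumes that, for tripeptide i and location class j, the confidence level is CL = 1 - P(X >= n_ij) with X ~ Binomial(N_i, q_j), where n_ij is the count of tripeptide i in class j, N_i is its count over all classes, and q_j is the share of all tripeptide occurrences that fall in class j. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import comb


def tripeptide_counts(seq):
    """Count overlapping tripeptides in one protein sequence."""
    return Counter(seq[i:i + 3] for i in range(len(seq) - 2))


def binomial_confidence(n_ij, N_i, q_j):
    """Assumed score: CL = 1 - P(X >= n_ij), X ~ Binomial(N_i, q_j)."""
    tail = sum(
        comb(N_i, m) * q_j ** m * (1 - q_j) ** (N_i - m)
        for m in range(n_ij, N_i + 1)
    )
    return 1.0 - tail


def rank_tripeptides(seqs_by_class):
    """Rank tripeptides by their highest confidence level over all classes."""
    # Pool tripeptide counts within each class, then over all classes.
    per_class = {
        label: sum((tripeptide_counts(s) for s in seqs), Counter())
        for label, seqs in seqs_by_class.items()
    }
    total = sum(per_class.values(), Counter())
    grand_total = sum(total.values())

    scores = {}
    for tri, N_i in total.items():
        scores[tri] = max(
            binomial_confidence(
                counts[tri], N_i, sum(counts.values()) / grand_total
            )
            for counts in per_class.values()
        )
    # Highest confidence first; ties broken alphabetically for stability.
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical toy data: two location classes with two short sequences each.
toy = {
    "stroma": ["AAAAAA", "AAAAC"],
    "thylakoid": ["GGGGGG", "GGGGC"],
}
ranking = rank_tripeptides(toy)
```

Under these assumptions, a feature set would be built by taking the top-ranked tripeptides and growing the set until accuracy under the chosen validation scheme stops improving.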
Keywords/Search Tags:subchloroplast location, PSSM, chemical shift, gene ontology, tripeptides, binomial distribution, SVM, SVM-KNN classification