The Research Of Subcellular Localization Prediction Based On Discrete Features

Posted on: 2013-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: L Huang
Full Text: PDF
GTID: 2230330395984827
Subject: Computer Science and Technology
Abstract/Summary:
Protein function, structure, and interactions are closely related to subcellular localization, so a reliable and efficient subcellular localization predictor is important for understanding proteins and designing new drugs. Predictors based on mathematical methods and computer technology have therefore become a major research focus. Although many predictors have been proposed in the past several years, two main problems remain. First, many predictors have high time and space complexity. Second, most predictors cannot be applied to datasets with limited data and an extremely unbalanced class distribution. To address these two problems, we propose two novel predictors based on discrete features. The work is summarized as follows.

For the first problem, we propose a subcellular localization predictor based on a novel graphical representation of proteins, HR-Curve. HR-Curve is built on amino acid classifications and dual vectors, and offers advantages such as high visibility, information completeness, classification visibility, and suitability for multiple applications. We also propose a simple but efficient computing method based on Euclidean distance, which greatly reduces time and space complexity. We then applied the HR-Curve predictor to subcellular localization prediction, and the experimental results show that it achieves high prediction accuracy as well as high efficiency.

For the second problem, we propose an SVM transfer predictor. It combines a feature extraction method based on the hydrophilic-hydrophobic classification of amino acids with the idea of transfer learning. By introducing a self-consistency checking condition, the predictor can train on data efficiently and predict subcellular localization with high accuracy. Finally, we demonstrate the advantages of the transfer predictor from two perspectives. On the one hand, control experiments demonstrate its applicability and efficiency. On the other hand, comparisons with other predictors further demonstrate the advantages and prospects of our method.
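As a rough illustration of what "discrete features" built from a hydrophilic-hydrophobic amino acid classification could look like, the sketch below reduces each sequence to a two-letter alphabet, counts class and adjacent-pair frequencies, and compares sequences by Euclidean distance (assumed here to be what the abstract's distance-based computing method refers to). It is not the thesis's HR-Curve or SVM transfer predictor. The residue grouping, the six-dimensional feature layout, the function names, and the toy sequences and labels are all assumptions made for this example.

```python
import math

# Assumed two-class grouping (one common convention; the thesis may use another).
# Any residue not listed here, including unknowns such as X, is treated as hydrophilic.
HYDROPHOBIC = set("AVLIMFWPGC")


def encode(seq):
    """Map a protein sequence to a string over {H, P} (H = hydrophobic, P = hydrophilic)."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in seq.upper())


def features(seq):
    """Return 6 discrete features: 2 class frequencies + 4 adjacent-pair frequencies."""
    s = encode(seq)
    n = len(s)
    if n < 2:
        raise ValueError("sequence must have at least two residues")
    composition = [s.count(c) / n for c in "HP"]
    pairs = [s[i:i + 2] for i in range(n - 1)]
    transitions = [pairs.count(a + b) / (n - 1) for a in "HP" for b in "HP"]
    return composition + transitions


def nearest_label(query, labelled):
    """Return the label of the labelled sequence closest to the query (Euclidean distance)."""
    q = features(query)
    return min(labelled, key=lambda item: math.dist(q, features(item[0])))[1]


# Toy, made-up sequences and labels, used only to show the call pattern.
toy_training = [
    ("AVLIMFWAVL", "membrane"),
    ("KRDESTNQKR", "nucleus"),
]
predicted = nearest_label("AVLKIMFW", toy_training)
print(predicted)
```

Each sequence is reduced to a fixed-length vector regardless of its length, so comparing two proteins costs a constant number of operations once the features are computed. That is the general reason discrete features tend to keep time and space costs low.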
Keywords/Search Tags: Protein Graphical Representation, Subcellular Localization Prediction, Feature Extraction, Support Machine Learning, Transfer Learning