Research on the Prediction of Post-translational Modification Sites and Drug Indications

Posted on: 2016-03-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G H Huang
GTID: 1224330479495623
Subject: Bioinformatics and systems biology
Abstract/Summary:
Protein post-translational modification is an important regulatory mechanism in the cell, and the type of a membrane protein is closely associated with its function. Accurately identifying the modification sites of proteins and the types of membrane proteins is therefore important for disease prevention and treatment. Drug repositioning is emerging as an important route to discovering new uses for drugs and has become a hot topic in computational biology. This thesis explored computational methods for predicting protein S-nitrosylation and carbamylation sites, membrane protein types and new drug indications. The main contributions are as follows:

1. Prediction of protein S-nitrosylation sites

A framework for computationally predicting S-nitrosylation sites was presented, based on sparse representation, kernel functions and feature selection. Proteins were first represented numerically by a total of 666 features drawn from six categories, including amino acid properties, secondary structure and frequencies. Both the minimum redundancy maximum relevance algorithm and kernel sparse representation classification were then employed to select the optimal features. Finally, kernel sparse representation classification was used to construct the prediction model. The predictor achieved a Matthews correlation coefficient (MCC) of 0.1634 in ten-fold cross-validation on the training set and 0.2919 on the independent test. For comparison with other prediction methods, we constructed an independent test set of 113 protein sequences. On this set the predictor also performed well, with an MCC of 0.2239, outperforming two other methods, iSNO-AAPair and iSNO-PseAAC, whose MCCs were 0.1125 and 0.1190, respectively. In addition, a web tool for predicting protein S-nitrosylation sites was developed at http://www.zhni.net/snopred/index.html.

2. Identification of protein carbamylation sites

For the first time, we presented a computational framework for predicting and analyzing carbamylated lysine sites, based on the one-class k-nearest neighbor method and two-stage feature selection. The one-class k-nearest neighbor method requires no negative samples for training. Using 280 optimal features, the method achieved a sensitivity of 82.50% in the leave-one-out test on the training set, and a sensitivity of 66.67%, a specificity of 100.00% and an MCC of 0.8097 in the independent test on the testing set. Further analysis of the optimal features provided some insight into the mechanism of action of carbamylated lysine sites.

3. Prediction of multi-label types of human membrane proteins

We explored how protein type relates to homology and to interactions among proteins, and proposed an integrated approach that predicts multi-label types of membrane proteins from sequence homology and the protein-protein interaction network. In leave-one-out tests on three datasets, the prediction accuracies reached 87.65%, 81.39% and 70.79%, respectively. The approach outperformed the nearest neighbor algorithm based on pseudo amino acid composition. In addition, a new metric for evaluating performance on multi-label problems was presented.

4. Prediction of drug indications

We explored how drug indications relate to chemical-chemical interactions and to structural similarities, and presented a method to computationally predict drug indications. Chemical-chemical interactions are used first to predict indications. Structural similarity is employed only if the query drug does not interact with any of the training drugs.
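The two-step rule described above (interactions first, structural similarity as a fallback) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function name rank_indications, the dictionary layout, the toy drug names and scores, and the choice to score each indication by the strongest link to any training drug carrying it are all hypothetical.

```python
from collections import defaultdict

def rank_indications(query, train_indications, interaction, similarity):
    """Rank candidate indications for `query`, most likely first.

    train_indications: dict mapping training drug -> set of indications
    interaction / similarity: dicts mapping (drug_a, drug_b) -> score
    Scoring by the maximum pairwise score is an assumption of this sketch.
    """
    def pair_score(table, a, b):
        # Pairs are treated as unordered.
        return table.get((a, b), table.get((b, a), 0.0))

    def score_with(table):
        scores = defaultdict(float)
        for drug, indications in train_indications.items():
            s = pair_score(table, query, drug)
            if s <= 0:
                continue  # no link between the query and this training drug
            for ind in indications:
                scores[ind] = max(scores[ind], s)
        return scores

    # Step 1: use chemical-chemical interactions if the query has any.
    scores = score_with(interaction)
    # Step 2: otherwise fall back to structural similarity.
    if not scores:
        scores = score_with(similarity)
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda ind: (-scores[ind], ind))

# Hypothetical toy data, for illustration only.
train = {
    "drugA": {"hypertension"},
    "drugB": {"asthma", "hypertension"},
    "drugC": {"diabetes"},
}
interaction = {("q1", "drugA"): 0.4, ("q1", "drugC"): 0.9}
similarity = {("q2", "drugB"): 0.7, ("q2", "drugC"): 0.2}

ranked_q1 = rank_indications("q1", train, interaction, similarity)
ranked_q2 = rank_indications("q2", train, interaction, similarity)
```

In this toy setup q1 has interaction partners, so only the interaction scores are used; q2 has none, so its ranking comes entirely from structural similarity. The first element of each returned list plays the role of the 1st order prediction.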
Five runs of 5-fold cross-validation on the training set of 1,573 drugs yielded an average accuracy of 51.48% over the five 1st order predictions. The model also achieved an accuracy of 50.00% for the 1st order prediction in an independent test on a dataset of 32 other drugs whose repositioning has been confirmed. Interestingly, some clinically repurposed drug indications that were not included in the dataset were successfully identified by the method.

5. Prediction of cancer drugs

A computational method based on chemical-chemical interactions was presented for predicting the cancers a drug may treat. For each query drug, the candidate cancers were ranked from the most likely to the least likely. The 1st order prediction accuracy was 55.93% on the training dataset, evaluated by the leave-one-out test, and 55.56% and 59.09% on a validation test dataset and an independent test dataset, respectively. The proposed method outperformed a popular method based on molecular descriptors. Moreover, some drugs were verified to be effective against their ‘wrong’ predicted indications, suggesting that some ‘wrong’ predictions are in fact potential indications. These promising results indicate that the method may become a useful tool for predicting drug indications.
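The sensitivity, specificity and MCC values quoted in the abstract follow the standard confusion-matrix definitions. A minimal sketch of those formulas is given below; the function name confusion_metrics and the counts in the example are hypothetical and are not taken from the thesis.

```python
import math

def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
sens, spec, mcc = confusion_metrics(tp=8, fp=2, tn=18, fn=2)
```

Unlike plain accuracy, MCC stays informative when positives are much rarer than negatives, which is typical for modification-site prediction.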
Keywords/Search Tags: post-translational modifications, drug repositioning, sparse representation, one-class k-nearest neighbor, protein S-nitrosylation, protein carbamylation, membrane protein type
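To illustrate how a one-class k-nearest neighbor classifier can be trained without negative samples, here is a minimal sketch of one common distance-threshold variant. It is not necessarily the variant used in the thesis: the class name OneClassKNN, the quantile-based threshold, the Euclidean metric and the toy points are all assumptions made for illustration.

```python
import math

def kth_neighbor_distance(x, points, k):
    """Euclidean distance from x to its k-th nearest point in `points`."""
    return sorted(math.dist(x, p) for p in points)[k - 1]

class OneClassKNN:
    """Accept a query if it lies as close to the positive samples
    as the positive samples lie to each other (assumed rule)."""

    def __init__(self, k=1, quantile=1.0):
        self.k = k
        self.quantile = quantile

    def fit(self, positives):
        self.positives = [tuple(p) for p in positives]
        # Leave-one-out k-th neighbor distance of every training sample.
        loo = sorted(
            kth_neighbor_distance(
                p, self.positives[:i] + self.positives[i + 1:], self.k
            )
            for i, p in enumerate(self.positives)
        )
        # Threshold taken at the chosen quantile of those distances.
        idx = min(len(loo) - 1, int(self.quantile * len(loo)))
        self.threshold = loo[idx]
        return self

    def predict(self, x):
        d = kth_neighbor_distance(tuple(x), self.positives, self.k)
        return d <= self.threshold

# Hypothetical 2-D feature vectors standing in for encoded positive sites.
positives = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5)]
model = OneClassKNN(k=1, quantile=1.0).fit(positives)
near = model.predict((0.4, 0.6))
far = model.predict((5, 5))
```

Here the point close to the positive cluster is accepted and the distant one is rejected, using only positive samples to set the decision threshold.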