| Human cytochrome P450s (CYP450s) are of central importance to drug interactions and to interindividual variability in phase I drug metabolism. They take part in the metabolism of many marketed drugs and are responsible for major pathways of drug clearance. Single nucleotide polymorphisms (SNPs) constitute a large proportion of the genetic polymorphisms of human CYP450s and play crucial roles in individual and population differences in response to diseases, viruses, toxins, drugs, etc. Among SNPs, non-synonymous SNPs (nsSNPs) lead to amino acid substitutions, which may change the structure and function of proteins and are associated with a high incidence of disease. We therefore focus on the mutations caused by human CYP450 nsSNPs in protein sequences. Support vector machine (SVM), an effective statistical learning method, has been widely used in mutation prediction. However, two factors, feature selection and parameter setting, strongly influence the efficiency and accuracy of SVM classification, and most current SVM methods optimize parameters only with the feature set held fixed. In this study, following the principles of the genetic algorithm (GA) and SVM, we develop a GA-SVM program that optimizes features and parameters simultaneously, so that fewer features are used and the overall prediction accuracy is improved. The GA-SVM program is applied to the mutation prediction of human CYP450 nsSNPs and reduces the size of the feature subset from the initial 147 features to 12. The final mutation predictive model also performs satisfactorily, with a prediction accuracy of 61% and a cross-validation accuracy of 73%, better than many typical linear and non-linear classification models. In addition, we analyze the influence of physicochemical and structural properties on mutation prediction and suggest that both kinds of properties should be considered when a mutation predictive model is constructed.
The results indicate that the GA-SVM program is a powerful tool for optimizing mutation predictive models of human CYP450 nsSNPs. Our study of the mutations caused by human CYP450 nsSNPs in protein sequences has implications for further research on human CYP450 coding SNPs (cSNPs), such as discovering new nsSNPs and identifying the differences between nsSNPs and synonymous SNPs in drug metabolism and disease occurrence. |
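
The simultaneous optimization of features and SVM parameters described above can be illustrated with a minimal sketch. It is not the authors' program: the chromosome layout (147 feature-mask bits followed by 8 bits each for the SVM parameters `C` and `gamma`), the log-scaled parameter ranges, the GA settings, and all function names are assumptions made for illustration. The fitness function here is a toy stand-in; in a real run it would be something like the cross-validated accuracy of an SVM trained on the selected feature columns.

```python
import random

N_FEATURES = 147   # size of the initial feature set reported in the abstract
PARAM_BITS = 8     # bits used to encode each SVM parameter (assumption)
LENGTH = N_FEATURES + 2 * PARAM_BITS


def _to_unit(bits):
    """Map a list of 0/1 bits to a float in [0, 1]."""
    return int("".join(map(str, bits)), 2) / (2 ** len(bits) - 1)


def decode(chrom):
    """Split a chromosome into (feature mask, C, gamma).

    The log2 ranges for C and gamma are assumed, not taken from the paper.
    """
    mask = chrom[:N_FEATURES]
    c_bits = chrom[N_FEATURES:N_FEATURES + PARAM_BITS]
    g_bits = chrom[N_FEATURES + PARAM_BITS:]
    C = 2.0 ** (-5 + 20 * _to_unit(c_bits))       # C in [2^-5, 2^15]
    gamma = 2.0 ** (-15 + 18 * _to_unit(g_bits))  # gamma in [2^-15, 2^3]
    return mask, C, gamma


def evolve(fitness, pop_size=30, generations=40, p_mut=0.01, seed=0):
    """Run a simple elitist GA.

    Returns the best chromosome and the best score seen at each generation.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(LENGTH)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Score every chromosome once per generation, best first.
        scored = sorted(((fitness(c), c) for c in pop),
                        key=lambda sc: sc[0], reverse=True)
        history.append(scored[0][0])
        # Elitism: copy the two best chromosomes unchanged.
        nxt = [scored[0][1][:], scored[1][1][:]]

        def pick():
            # Tournament selection among three random candidates.
            return max(rng.sample(scored, 3), key=lambda sc: sc[0])[1]

        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, LENGTH)        # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation on every gene with probability p_mut.
            child = [b ^ 1 if rng.random() < p_mut else b for b in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


def toy_fitness(chrom):
    """Toy stand-in for cross-validated SVM accuracy (hypothetical).

    Pretends the first 12 features are informative and charges a small
    cost per selected feature; C and gamma are decoded but unused here.
    """
    mask, _C, _gamma = decode(chrom)
    return sum(mask[:12]) - 0.1 * sum(mask)


if __name__ == "__main__":
    best, history = evolve(toy_fitness)
    mask, C, gamma = decode(best)
    print("selected features:", sum(mask), "C:", C, "gamma:", gamma)
```

Because one chromosome carries both the feature mask and the parameter bits, crossover and mutation search the two spaces jointly, instead of tuning parameters for a fixed feature set. To use a real classifier, `toy_fitness` could be replaced by a function that scores the decoded mask and parameters with, for example, scikit-learn's `SVC(C=C, gamma=gamma)` under `cross_val_score`.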