| Protein phosphorylation is one of the most important protein post-translational modifications (PTMs). As the body of protein phosphorylation site data grows, the number of phosphoprotein databases is constantly increasing, and dozens of tools are available for predicting protein phosphorylation sites quickly and automatically. Rice (Oryza sativa L.) is a representative model monocotyledonous (monocot) species with an immense socio-economic impact on human civilization. However, none of the existing SVM tools has been developed to predict protein phosphorylation sites in rice. In this paper, we collected rice phosphorylation sites from recent literature, including Nakagami et al. (2010), and from the feature table of the Swiss-Prot database. After removing redundant phosphorylation sites, the numbers of phosphoserine (S), phosphothreonine (T) and phosphotyrosine (Y) substrates were 3806, 547 and 129, respectively, involving 2162 proteins. The composition of k-spaced amino acid pairs (CKSAAP) was employed to analyze the sequences surrounding a query site, and the large set of phosphorylation data was clustered using maximal dependence decomposition (MDD). Standard 10-fold cross-validation and LibSVM were used to construct a new rice-specific SVM phosphorylation site predictor, SVMphos_Rice. The results showed that its accuracy (ACC) for phosphoserine, phosphothreonine and phosphotyrosine reached 60.6%, 56.9% and 72.3%, respectively, and its Matthews correlation coefficient (MCC) reached 0.295, 0.167 and 0.484, respectively. Compared with the recently developed predictor PlantPhos, SVMphos_Rice achieved significant increases in ACC and MCC of 17.3% and 0.343, respectively, for phosphotyrosine prediction. Therefore, SVMphos_Rice can serve as a useful tool for protein phosphorylation prediction in rice; it is freely accessible at http://bioinformatics.fafu.edu.cn/SVMphos_Rice. |
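
To make the CKSAAP encoding mentioned above concrete, the following is a minimal sketch of how such a feature vector could be computed for the sequence window around a query site. It is an illustration, not the SVMphos_Rice implementation: the function name `cksaap`, the `k_max` parameter and its default, the example window, and the choice to normalize by the number of pair positions are all assumptions, since the abstract does not state them.

```python
from itertools import product

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(window, k_max=2):
    """Encode a sequence window as k-spaced amino acid pair frequencies.

    For each gap k in 0..k_max, count every pair of residues separated by
    exactly k positions and divide by the number of pair positions in the
    window. Pairs containing a non-standard character (for example a
    padding symbol) are skipped. k_max=2 is a hypothetical default.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(window) - k - 1
        for i in range(max(n_positions, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        # Assumed normalization: frequency over all pair positions.
        features.extend(
            counts[p] / n_positions if n_positions > 0 else 0.0 for p in PAIRS
        )
    return features


# Hypothetical 13-residue window centered on a candidate serine.
vector = cksaap("MKRASPSPTLEAG", k_max=2)
print(len(vector))  # 3 gaps x 400 pairs = 1200 features
```

Each window yields a fixed-length vector of 400 values per gap, which can be passed to an SVM library such as LibSVM together with a positive or negative label for the site.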