Ovarian cancer is a leading cause of cancer death among women and has the highest mortality rate among gynecological cancers worldwide, making it a serious threat to women's health. Early detection is crucial for effective treatment. Proteomics has recently emerged as a technique for early cancer diagnosis, and surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) has been used as an early screening technique for detecting cancer. However, the data obtained by SELDI-TOF-MS are complex, high-dimensional and redundant. To detect ovarian cancer accurately, we present a method that combines probabilistic principal component analysis (PPCA) with a support vector machine (SVM) for analyzing SELDI-TOF-MS data from clinical proteomic studies. First, several preprocessing methods are combined to remove baseline drift and noise. Then, PPCA is applied to the high-dimensional mass spectrometry data for feature extraction, feature optimization and dimensionality reduction. Finally, we randomly select 70% of the 216 mass spectra as a training set to build the SVM model, optimize its parameters by grid search, and use the remaining 66 spectra as a test set for prediction and verification. The recognition rate and the predictive rate are used to evaluate the classification performance of the model. To further verify the classification performance of the PPCA-SVM model, we compared it with a BP neural network model (PCA-BP) and a PCA-SVM model. The predictive rate of PCA-BP is 81.80%. The predictive rate, sensitivity and specificity of PCA-SVM are 82.26%, 82.29% and 82.25%, respectively, and those of PPCA-SVM are 89.81%, 90.45% and 88.00%. The experimental results show that the proposed PPCA-SVM model is an effective, accurate and repeatable method for automatically detecting ovarian cancer. This method lays the groundwork for the clinical application of early ovarian cancer diagnosis.
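
The random split and the evaluation measures named above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the "predictive rate" is the overall accuracy on the test set, and that sensitivity and specificity follow their usual definitions, TP / (TP + FN) and TN / (TN + FP). The helper names `split_indices` and `evaluate`, the random seed, and the toy labels are hypothetical and are not data from the study.

```python
import random


def split_indices(n_samples, n_test, seed=0):
    """Randomly partition range(n_samples) into training and test index lists."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def evaluate(y_true, y_pred, positive=1):
    """Return (predictive rate, sensitivity, specificity) for binary labels.

    Assumption: the predictive rate is taken to be plain accuracy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# 216 spectra in total, 66 of them held out for testing (sizes from the text).
train_idx, test_idx = split_indices(216, 66)

# Hypothetical toy labels (1 = cancer, 0 = control), not results from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
acc, sens, spec = evaluate(y_true, y_pred)
print(f"predictive rate={acc:.2%}, sensitivity={sens:.2%}, specificity={spec:.2%}")
```

In an actual study, `y_pred` would come from the trained classifier applied to the held-out spectra, and the split would normally be repeated or stratified by class so that the reported rates are stable.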