Research Of Ovarian Cancer Protein Mass Spectrometry Data Analysis Model

Posted on: 2017-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: L L Cui
GTID: 2284330485481267
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
In pathological conditions, the expression abundance of proteins and peptides in carcinoma tissue changes abnormally. An optimal set of markers selected from cancer-specific peptide or protein expression can be used clinically to build diagnostic assays for a specific cancer.

The first main objective of this thesis is to introduce a method, based on the LMS and Relief algorithms, that classifies low-resolution MALDI-TOF data more accurately. The method comprises the local maximum search (LMS) peak detection algorithm, the Relief feature selection algorithm, and a support vector machine (SVM) classifier. On Ovarian Dataset 8-7-02, LMS was a very effective peak detection algorithm once its parameters were optimized, and Relief performed well in feature selection. The SVM classifier, evaluated on expected testing accuracy, achieved satisfactory performance in distinguishing cancer samples from healthy ones. The best parameter set for LMS was found with the control variable method and gave an average accuracy of 99.9738% (sd = 0.0018) and an average specificity of 97.7437% (sd = 0.0109) over 1000 independent 10-fold cross-validations.

The second main objective is to propose an improved feature selection algorithm based on feature weighting. The new algorithm combines the score from the F-value with the weight from Relief and classifies high-resolution matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) data more accurately. We developed a four-step data processing strategy: (1) aligning the study sets by binning the raw MS data, (2) local maximum search (LMS) peak detection, (3) the new combined feature weighting selection algorithm, and (4) support vector machine classification, which achieved satisfactory performance in distinguishing cancer samples from healthy ones. The best parameter set for LMS was again found with the control variable method, giving an average accuracy of 97.4167% (sd = 0.0146) and a best accuracy of 98.6111% over 1000 independent 10-fold cross-validations.
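
The peak detection and feature weighting steps named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: every function name and parameter (`half_window`, `min_intensity`, and so on) is hypothetical, Relief visits every sample once instead of drawing random samples, and multiplying the min-max normalized F-value and Relief weight is only an assumed combination rule, since the abstract does not say how the two scores are merged. Binning and the SVM classifier are left out.

```python
def local_maximum_peaks(intensities, half_window=2, min_intensity=0.0):
    """Return indices whose intensity is the maximum of the surrounding window."""
    peaks = []
    for i, value in enumerate(intensities):
        lo = max(0, i - half_window)
        window = intensities[lo:i + half_window + 1]
        # On a plateau, keep only the first index that reaches the maximum.
        if value > min_intensity and window.index(max(window)) == i - lo:
            peaks.append(i)
    return peaks


def relief_weights(samples, labels):
    """Basic two-class Relief (deterministic variant: every sample is visited once)."""
    n, d = len(samples), len(samples[0])
    # Feature ranges, used to scale differences to [0, 1].
    spans = [(max(r[j] for r in samples) - min(r[j] for r in samples)) or 1.0
             for j in range(d)]

    def distance(a, b):
        return sum(abs(x - y) / s for x, y, s in zip(a, b, spans))

    weights = [0.0] * d
    for i in range(n):
        hits = [k for k in range(n) if k != i and labels[k] == labels[i]]
        misses = [k for k in range(n) if labels[k] != labels[i]]
        if not hits or not misses:
            continue
        hit = min(hits, key=lambda k: distance(samples[i], samples[k]))
        miss = min(misses, key=lambda k: distance(samples[i], samples[k]))
        for j in range(d):
            # Reward features that differ on the nearest miss and agree on the nearest hit.
            diff_miss = abs(samples[i][j] - samples[miss][j]) / spans[j]
            diff_hit = abs(samples[i][j] - samples[hit][j]) / spans[j]
            weights[j] += (diff_miss - diff_hit) / n
    return weights


def f_values(samples, labels):
    """One-way ANOVA F statistic for each feature."""
    classes = sorted(set(labels))
    n, d = len(samples), len(samples[0])
    scores = []
    for j in range(d):
        grand_mean = sum(r[j] for r in samples) / n
        between = within = 0.0
        for c in classes:
            group = [r[j] for r, y in zip(samples, labels) if y == c]
            mean = sum(group) / len(group)
            between += len(group) * (mean - grand_mean) ** 2
            within += sum((v - mean) ** 2 for v in group)
        within = max(within, 1e-12)  # guard against zero within-class variance
        scores.append((between / (len(classes) - 1)) / (within / (n - len(classes))))
    return scores


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    return [0.0] * len(values) if hi == lo else [(v - lo) / (hi - lo) for v in values]


def combined_scores(samples, labels):
    """ASSUMPTION: combine the two criteria as a product of normalized scores."""
    f_norm = min_max(f_values(samples, labels))
    w_norm = min_max(relief_weights(samples, labels))
    return [f * w for f, w in zip(f_norm, w_norm)]
```

In such a sketch, each spectrum would first be reduced to peak intensities with `local_maximum_peaks`, the resulting peak-by-sample matrix would be scored with `combined_scores`, and the top-ranked features would then be passed to a classifier.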
Keywords/Search Tags: Mass Spectrometry (MS), Peak detection, Local Maximum Search, Feature weighting selection algorithm, Relief