
An Analytical Study On Lysophospholipid Dataset For Ovarian Cancer Detection

Posted on: 2012-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: S F Chen
Full Text: PDF
GTID: 2120330335462658
Subject: Signal and Information Processing
Abstract/Summary:
Ovarian carcinoma (OvCa) is the most lethal type of gynecological cancer. More than 75% of diagnoses are made at stages III and IV. The 5-year survival rate for women diagnosed at a late stage is 25%, whereas it would be more than 90% if the disease were diagnosed at stage I. In China, the incidence of ovarian cancer continues to rise rapidly, posing a serious threat to women's health.

Clinical research has reported that lysophospholipids are relevant to OvCa. In this study, we identify biomarkers from lysophospholipid data for early OvCa prediction. Biomarker selection methods are now used extensively in OvCa classification and diagnosis. However, many studies focus only on improving classification accuracy and ignore the balance between sensitivity and specificity. Moreover, most of the resulting models include too many variables, which makes them hard to apply clinically.

This thesis is based on a lysophospholipid dataset. Having found that batch-to-batch variation obscures the biological effect, we present a new data preprocessing method to remove that variation. Technically, we combine Singular Value Decomposition (SVD) and a Monte Carlo Decision Tree: SVD is used to correlate biomarkers with cancer diagnostic outcomes, and the Monte Carlo Decision Tree is used to fine-tune the selected biomarker set with respect to classification accuracy. To address the problem of too many variables, this thesis presents a novel, simple two-stage model based on a minimal number of biomarkers. A statistical permutation test shows that the results of the simple model are comparable to those of complex models with four or more variables. The simple model maintains high classification accuracy while using fewer variables, and is therefore more cost-effective.

The first part introduces the background of ovarian cancer and briefly discusses the significance of this research project.

The second part presents the new data preprocessing method and studies the influence of batch, plasma, and serum on classification.

The third part presents an approach to feature selection for early ovarian cancer detection. The method combines Singular Value Decomposition and a Monte Carlo Decision Tree, as described above, to correlate biomarkers with diagnostic outcomes and to fine-tune the selected biomarker set with respect to classification accuracy.

The fourth part presents the simple two-stage model based on a minimal number of biomarkers, together with the permutation test showing that its results are comparable to those of complex models with four or more variables.

The final part summarizes the work of this thesis and discusses directions for future research.
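The abstract does not state which permutation test was used or how sensitivity and specificity were computed. As an illustration only, the following minimal Python sketch assumes a paired sign-flip permutation test on the per-sample correctness of two classifiers, with labels coded as 1 (cancer) and 0 (control). The function names, the label coding, and the choice of test are assumptions for this example, not the thesis's actual procedure.

```python
import random


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = cancer, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def paired_permutation_test(y_true, pred_a, pred_b, n_perm=10000, seed=0):
    """Two-sided sign-flip permutation test on the accuracy difference.

    Under the null hypothesis that both models are equally accurate, the
    labels "model A" and "model B" are exchangeable for each sample, so the
    sign of each per-sample correctness difference can be flipped at random.
    Returns (absolute observed accuracy difference, p-value).
    """
    rng = random.Random(seed)
    # +1 if only A is correct, -1 if only B is correct, 0 if they agree.
    diffs = [int(a == t) - int(b == t) for t, a, b in zip(y_true, pred_a, pred_b)]
    observed = abs(sum(diffs))
    extreme = 0
    for _ in range(n_perm):
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(permuted) >= observed:
            extreme += 1
    # The +1 correction keeps the estimated p-value strictly positive.
    p_value = (extreme + 1) / (n_perm + 1)
    return observed / len(diffs), p_value
```

With such a test, a large p-value would be consistent with the claim that a simple model and a more complex model perform comparably, although it does not by itself prove equivalence.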
Keywords/Search Tags: ovarian cancer, singular value decomposition, Monte Carlo decision tree, feature selection, stability, minimal number of biomarkers, sensitivity, specificity