| Breast cancer has the highest incidence and mortality of any cancer among women worldwide. China has a large population, and its breast cancer incidence is rising every year, posing a serious threat to the health of Chinese women. Breast cancer screening is one of the effective means of improving survival rates. How to combine data analysis and computational techniques with medical risk preferences when screening breast cancer patients is an open problem. This paper therefore investigates a diagnostic test evaluation metric that is based on statistical analysis and incorporates the medical risk preferences of patients and decision makers, and then uses it to build a high-precision breast cancer classification model, which is of great significance for precision breast cancer treatment. Decision curve analysis (DCA) is a graphical statistical summary used to evaluate biomarkers or classification models. DCA is built on an expected utility theory framework that takes into account both the benefit of the intervention and its cost for patients who will not benefit from it. It is generally considered an improvement on the ROC curve. This paper therefore proposes several DCA estimation methods, develops decision curve theory, and provides a decision-curve-based breast cancer classification model for decision makers who need to balance benefits and risks. The main contributions of this paper are: (1) Multiple decision curve assessment methods are proposed. First, the existing non-parametric DCA estimation methods are improved, and two non-parametric DCA estimators are proposed, one based on the bootstrap and one on Bernstein polynomials. The statistical properties of the new estimators are analyzed, and their variance and accuracy are found to be better than those of the traditional non-parametric DCA estimator. Second, an expression for the parametric estimator of the decision curve based on maximum likelihood
estimation is proposed, and the characteristics of the method are studied. The results show that the proposed parametric estimator has a simple formula that depends only on the sample mean and variance, and that it has good statistical properties such as consistency and asymptotic normality. (2) A decision-curve-based model for high-precision classification and identification of breast cancer is constructed. To illustrate how the parametric decision curve estimation method is applied in practice, the complete process is detailed in Section 4, taking the Wisconsin Diagnostic Breast Cancer dataset as an example. First, to obtain a pool of explanatory variables made up of as many biomarkers with high discriminative ability as possible, the parametric decision curve estimation method is used to screen biomarkers from the original dataset. Then, combined with forward stepwise regression, a high-precision breast cancer classification and recognition model is constructed using the area under the decision curve (WA-NBC) index. The model is compared with breast cancer recognition models constructed by machine learning methods. The results show that the DCA-based breast cancer classification and recognition model has higher prediction accuracy. |
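
For readers unfamiliar with decision curves, the quantity plotted in DCA can be sketched as follows. This is a minimal illustration of the standard empirical net benefit at a threshold probability, not the bootstrap, Bernstein polynomial, or maximum likelihood estimators proposed in the paper. The function names `net_benefit` and `treat_all_net_benefit` and the toy data are assumptions made purely for illustration.

```python
def net_benefit(y_true, risk, threshold):
    """Empirical net benefit of treating every case whose predicted risk
    is at least `threshold` (hypothetical helper, standard DCA definition)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: treated cases that really have the disease.
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    # False positives: treated cases that do not have the disease.
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # The odds of the threshold weight the harm of an unnecessary intervention.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def treat_all_net_benefit(y_true, threshold):
    """Net benefit of the reference strategy that treats everyone."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy labels (1 = malignant) and predicted risks, invented for illustration.
    labels = [1, 1, 0, 0]
    risks = [0.9, 0.6, 0.7, 0.2]
    # A decision curve is net benefit traced over a grid of thresholds.
    for pt in (0.1, 0.3, 0.5, 0.7):
        print(pt, net_benefit(labels, risks, pt), treat_all_net_benefit(labels, pt))
```

Plotting these values against the threshold gives the decision curve. A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, where zero is the net benefit of treating no one.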