Breast cancer is one of the most common malignant tumors among women worldwide. In recent years, its incidence and mortality have been rising and the age at onset has been trending younger, which seriously threatens women's lives and health. Many medical institutions and scholars are now devoted to breast cancer research, especially to predicting its risk. However, because the pathogenesis of breast cancer is not yet clear, the disease can only be prevented and its prognosis improved through early detection and early treatment. Most countries and regions have established breast cancer risk assessment models for different populations based on epidemiological data, but many of these studies rely on traditional regression methods or on a single machine learning model.
Drawing on the theory and experience of earlier work, this thesis combines the strengths of different single machine learning models with those of the traditional Gail risk assessment model in applicability and accuracy. A combined breast cancer risk prediction model is constructed from experimental results on several authoritative data sets, and a software system for breast cancer risk analysis is then developed that can be applied generally to groups at potential risk of breast cancer.
Empirical breast cancer data from the Breast Cancer Family Registry (BCFR) project were studied. The data were preprocessed, missing values were filled in, a collinearity analysis was performed, and the balanced data set was divided into healthy subjects and breast cancer patients. Two machine learning models, Support Vector Machine (SVM) and Logistic Regression, were selected as single models and compared side by side with the traditional Gail model. The theory behind the three models was reviewed, the two machine learning models were trained, their accuracy and recall were measured on the test set, and their AUC values were compared. The empirical results show that the Gail model performs best, followed by Logistic Regression. Because a single model has an upper limit on prediction quality and a performance bottleneck, and because the single models have different strengths, the minimum variance method was used to calculate weights and two parallel combined models, Gail-SVM and Logistic-SVM, were constructed. The results indicate that the overall predictive performance of the combined models is better than that of the single models.
On this basis, the actual needs of users were analyzed, and a breast cancer risk analysis system built on the prediction model described above was designed and implemented with a microservice architecture. Questionnaires were used to collect and store user data. MyBatis-Plus and the Spring Boot framework were used to build the back-end functional modules that handle the business logic and return conclusions according to the model's predictions. The Vue framework and the Element UI component library were used to build the front-end data pages, and the ECharts charting tool was used to visualize the breast cancer data, thus realizing the core functions of the system.
In this thesis, a combined risk prediction model is constructed by combining machine learning models with the traditional breast cancer risk assessment model. Experiments verify the effectiveness of the proposed model, and a software system is designed and implemented based on the research results, providing a practical tool for breast cancer risk prediction.
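The abstract does not spell out how the minimum variance weights are computed. One common reading, offered here only as a minimal sketch, is inverse-variance weighting: each model receives a weight proportional to the reciprocal of the variance of its prediction errors, and the weights are normalized to sum to one. That formula, the function names `minimum_variance_weights` and `combine_predictions`, and the toy numbers are all assumptions made for illustration; none of them are taken from the thesis.

```python
import numpy as np


def minimum_variance_weights(y_true, model_probs):
    """Inverse-variance weights for combining models (assumed formulation).

    y_true:      array of 0/1 labels, shape (n_samples,)
    model_probs: predicted probabilities, shape (n_models, n_samples)
    """
    y_true = np.asarray(y_true, dtype=float)
    model_probs = np.asarray(model_probs, dtype=float)
    # Prediction error of each model on each sample.
    errors = y_true - model_probs
    # Population variance of the errors, one value per model.
    variances = errors.var(axis=1)
    if np.any(variances == 0):
        raise ValueError("a model with zero error variance cannot be weighted")
    inverse = 1.0 / variances
    # Normalize so the weights sum to 1.
    return inverse / inverse.sum()


def combine_predictions(model_probs, weights):
    """Weighted average of the models' predicted probabilities."""
    return np.asarray(weights, dtype=float) @ np.asarray(model_probs, dtype=float)


# Toy illustration with made-up numbers (not data from the study).
y = [1, 0, 1, 0]
probs = [
    [0.9, 0.2, 0.8, 0.1],  # hypothetical model A
    [0.6, 0.5, 0.7, 0.4],  # hypothetical model B
]
weights = minimum_variance_weights(y, probs)
combined = combine_predictions(probs, weights)
```

Under this reading, the model whose errors vary less on the validation data contributes more to the combined score, which is one way a parallel combination such as Gail-SVM or Logistic-SVM could exceed either single model.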