| Objective: To use data mining technology to find association rules among the diagnostic indicators of prostate cancer and to assist in constructing a diagnostic model of prostate cancer. Methods: Through the urology big-data database management platform, data mining technology was used to retrieve the data of patients admitted to our department from January 1, 2015 to December 31, 2015 who underwent prostate needle biopsy for the first time. Python was used for data processing, and statistical methods were used to determine which indicators could be included in the diagnosis of prostate cancer. The statistically significant indicators were included in the item set, and the Apriori and FP-Growth algorithms were used to find frequent item sets and the association rules among them. Results: 1. After data preprocessing and cleaning, a total of 426 prostate biopsy patients were included in this study, divided into 175 cases in the "prostate cancer group" and 251 cases in the "non-prostate cancer group". Statistical analysis of the candidate indicators showed that "prostate biopsy needle count" and "bladder irritation sign" could not be incorporated into the diagnosis of prostate cancer, whereas tPSA, f/tPSA, age, clinical symptoms (dysuria, difficult defecation, gross hematuria, bone pain) and prostate MRI could all be incorporated. 2. The FP-Growth algorithm was used to mine the included indicators, with the minimum support set to 20.00%. This yielded 29 frequent item sets associated with prostate cancer: 5 frequent 2-item sets, 10 frequent 3-item sets, 10 frequent 4-item sets and 4 frequent 5-item sets. 3. The Apriori algorithm was used to find association rules leading from the frequent item set combinations to the diagnosis of prostate cancer, with the minimum confidence set to 70.00%. Twelve strong association rules were finally generated, among which "S1S2S4S5→A1" had the highest confidence, 75.91%; that is, "when the prostate MRI result suggests cancer, tPSA is ≥ 4 ng/ml, the patient is aged 60 years or older, and dysuria is present, the patient has a 75.91% probability of being diagnosed with prostate cancer." Conclusion: 1. Using big data analysis technology, a urology department database management platform was set up and a prostate cancer database was established. 2. Using the Apriori and FP-Growth algorithms, strong association rules were mined between prostate cancer and the prostate MRI result, tPSA, f/tPSA, age, dysuria and gross hematuria, laying a big-data foundation for constructing a prostate cancer diagnosis model. |
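
The two thresholds named in the abstract, minimum support (20.00%) and minimum confidence (70.00%), can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The transactions are invented toy data, the item labels (`MRI+`, `tPSA>=4`, `age>=60`, `dysuria`, `PCa`) are hypothetical stand-ins for the study's coded indicators, and a plain Apriori-style level-wise search is used for the frequent item sets, where the study reports using FP-Growth for that step.

```python
from itertools import combinations

# Toy transactions, invented for illustration (NOT the study's data).
# Each set lists the findings present for one hypothetical patient;
# "PCa" marks a biopsy positive for prostate cancer.
transactions = [
    {"MRI+", "tPSA>=4", "age>=60", "dysuria", "PCa"},
    {"MRI+", "tPSA>=4", "age>=60", "PCa"},
    {"MRI+", "tPSA>=4", "age>=60", "dysuria"},
    {"tPSA>=4", "age>=60"},
    {"MRI+", "tPSA>=4", "age>=60", "dysuria", "PCa"},
    {"age>=60", "dysuria"},
    {"MRI+", "tPSA>=4", "dysuria", "PCa"},
    {"tPSA>=4", "age>=60", "dysuria"},
]


def support(itemset, rows):
    """Fraction of rows that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= row for row in rows) / len(rows)


def frequent_itemsets(rows, min_support):
    """Apriori-style level-wise search; returns {itemset: support}."""
    items = sorted({item for row in rows for item in row})
    level = [frozenset([i]) for i in items if support([i], rows) >= min_support]
    found = {}
    while level:
        for s in level:
            found[s] = support(s, rows)
        # Join step: merge pairs of frequent k-item sets into (k+1)-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == len(a) + 1}
        # Prune step: keep a candidate only if all its k-item subsets are frequent
        # and the candidate itself reaches the minimum support.
        level = [
            c for c in candidates
            if all(frozenset(sub) in found for sub in combinations(c, len(c) - 1))
            and support(c, rows) >= min_support
        ]
    return found


def rules_to_target(found, target, min_confidence):
    """Rules X -> target, with confidence = support(X and target) / support(X)."""
    rules = []
    for itemset, sup in found.items():
        if target in itemset and len(itemset) > 1:
            antecedent = itemset - {target}
            confidence = sup / found[antecedent]
            if confidence >= min_confidence:
                rules.append((tuple(sorted(antecedent)), confidence))
    return sorted(rules, key=lambda rule: -rule[1])


# Same thresholds as quoted in the abstract, applied to the toy data.
found = frequent_itemsets(transactions, min_support=0.20)
rules = rules_to_target(found, target="PCa", min_confidence=0.70)
for antecedent, confidence in rules:
    print(" + ".join(antecedent), "-> PCa", f"confidence={confidence:.2%}")
```

Read this way, a rule's confidence is the share of patients with the whole indicator combination who also have a positive biopsy, which is how the abstract interprets the 75.91% reported for "S1S2S4S5→A1".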