| Esophageal squamous cell carcinoma (ESCC) is a cancer with high incidence, and China is one of the regions of the world with a high incidence of ESCC. Traditionally, the diagnosis and prognosis of ESCC have depended largely on doctors’ clinical experience. In recent years, with the development of computer science and technology, this situation has changed with the help of precision medicine technology based on big data and artificial intelligence. However, the construction and analysis of big data platforms for the diagnosis and treatment of ESCC is still in its early stage. Funded by the Shandong Provincial Key R&D Program “Key technologies of individualized radical chemoradiotherapy for ESCC based on multi-center clinical research queue”, this paper takes the clinical data of ESCC patients from Shandong Province Cancer Research Institute between 2013 and 2017 as its research object and constructs an application platform for ESCC clinical data. Based on this platform, we then predict distant metastasis in patients with ESCC. The work consists of three parts: (1) In the first part, we import the clinical data exported from the Hospital Information System (HIS) of Shandong Province Cancer Research Institute into a local ESCC database. First, the database is designed according to the fields provided by the hospital. Second, we establish the ESCC database to store the clinical data. After data processing, we import the clinical data into the corresponding tables of the database with a script. The clinical data include patient demographic information, disease history, current medical history and so on. (2) In the second part, we design and implement a clinical application platform for ESCC based on Java Web, which integrates Spring, MyBatis and Shiro. The platform implements role and permission management. It can also display clinical data, including patient management and laboratory examination data, in both text
and statistical charts. (3) In the third part, we use imbalanced learning to predict distant metastasis in ESCC. This study proposes a novel scheme in which blood cell analysis is used to predict distant metastasis in patients with ESCC, with the aim of replacing imaging, puncture and surgery. In practice, ESCC patients without distant metastasis outnumber those with distant metastasis, resulting in highly imbalanced data. To deal with this imbalance, we adopt oversampling algorithms at the data level to increase the proportion of positive samples so that the dataset is balanced between the positive and negative classes. For comparison, we also apply two cost-sensitive learning methods so that the classifier pays more attention to the positive samples. Finally, we tune the positive sample ratio to find the best oversampling ratio. The experimental results show that the Area Under the Curve (AUC) and G-Mean are best when the ratio of positive to negative samples is between 0.9:1 and 1.1:1. Constructing the clinical data platform and predicting distant metastasis in ESCC are important for promoting precision treatment. These two pieces of work lay a good foundation for the next step, the development of individualized treatment plans. |
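
The oversampling step and the G-Mean metric described in part (3) can be illustrated with a minimal sketch. It assumes plain random oversampling (duplicating positive samples), which stands in for whichever oversampling algorithms the study actually used; the function names `oversample_to_ratio` and `g_mean`, the label encoding (1 = distant metastasis, 0 = no distant metastasis), and the `ratio` parameter are hypothetical and not taken from the paper.

```python
import math
import random


def oversample_to_ratio(samples, labels, ratio=1.0, seed=0):
    """Duplicate random positive samples until positives:negatives is about `ratio`.

    Assumption: plain random oversampling, used here only as an
    illustration; the study's own oversampling algorithms may differ.
    """
    rng = random.Random(seed)
    positives = [s for s, y in zip(samples, labels) if y == 1]
    negatives = [s for s, y in zip(samples, labels) if y == 0]
    if not positives:
        raise ValueError("need at least one positive sample to oversample")

    # Number of positives required to reach the requested ratio.
    target = round(ratio * len(negatives))
    n_extra = max(0, target - len(positives))
    extra = [rng.choice(positives) for _ in range(n_extra)]

    new_samples = negatives + positives + extra
    new_labels = [0] * len(negatives) + [1] * (len(positives) + len(extra))
    return new_samples, new_labels


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (positive recall) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    # Toy data: 2 positive and 8 negative samples.
    X = list(range(10))
    y = [1, 1] + [0] * 8
    X_bal, y_bal = oversample_to_ratio(X, y, ratio=1.0)
    print(sum(y_bal), len(y_bal) - sum(y_bal))
    print(g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a sketch like this, oversampling would normally be applied to the training split only, with AUC and G-Mean computed on an untouched held-out split, so that duplicated positive samples do not leak into the evaluation. Sweeping `ratio` over a range of values is one way to search for the best positive-to-negative ratio.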