Disease Monitoring And Research Of Key Technology Based On Data Mining

Posted on: 2019-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Lu
GTID: 2394330566465447
Subject: Master of Engineering
Abstract/Summary:
Data mining is the process of extracting potentially valuable information from large amounts of fuzzy, noisy data. Since the reform and opening-up, China has developed rapidly and, through the coordinated work of national and local statistical departments, has accumulated a large amount of valuable professional data. Mining this historical data to support regional and individual disease surveillance, as well as to assist diagnosis and disease prevention, has become an active research topic. Taking malignant tumors as an example and drawing on the statistical yearbooks published by the National Bureau of Statistics of China, this paper studies data on the regional natural, biological, social, and humanistic environments together with the incidence of malignant tumors in China. It then establishes a multiple linear regression model to help explore the effect of regional environmental characteristics on cancer. The paper also proposes a malignant tumor data sharing platform. By standardizing the structure of the data, the platform is intended to integrate the advanced resources of medical institutions, scientific research institutions, and government agencies, and to provide a theoretical basis for national and local decision-making agencies to act promptly on disease prevention and treatment.

The research objects are Chinese cancer morbidity and mortality data for 2004-2016, 2015 spatial distribution data on China's cancer villages, and related regional characteristics data, gathered through media reports and field visits. Data mining techniques are used to explore the relationship between cancer incidence and the environmental characteristics of the areas concerned. Finally, a cancer data sharing platform based on a browser/server architecture is designed and implemented with the Django framework. The main work of the dissertation consists of the following three points:

(1) The paper extracts data on cancer incidence among Chinese residents from 2004 to 2016 from the National Statistical Yearbook, together with 13 regional characteristics, such as economic and natural data, for the same period. Feature extraction is performed with Decision Tree, Random Forest, AdaBoost, correlation matrices, and other methods, giving a preliminary estimate of the effective weight of each regional characteristic on cancer.

(2) The paper collates data on the distribution of cancer villages across Chinese provincial regions in 2015 from media sources such as the internet, newspapers, and academic papers. It also extracts data on 12 regional characteristics, such as economic and natural characteristics, for 2015 from the National Statistical Yearbook. Feature extraction again uses Decision Tree, Random Forest, AdaBoost, correlation matrices, and other methods, while K-means clustering provides an initial exploration of the spatial distribution and patterns of change of cancer villages in China. Comparing these findings with the results of the time-dimension analysis, the paper selects appropriate input factors to construct multiple linear regression models. Experiments show that the optimized multiple linear regression model has a goodness of fit of 0.8489 and an MSE of 0.0226. This indicates that the model predicts well and could be used to estimate the number of cancer villages in a region as well as the region's potential cancer status.

(3) The paper designs a cancer data sharing platform based on the Django framework. A functional simulation on a local area network (LAN) was carried out successfully. The purpose is to integrate the professional and functional resources of national and local medical institutions, research institutions, and government agencies, helping the relevant departments with disease prevention and treatment. Testing showed that the platform ran stably and that the designed functions were essentially realized.
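
To illustrate the regression step and the two fit metrics the abstract reports (goodness of fit and MSE), here is a minimal Python sketch of multiple linear regression by ordinary least squares. It is not the thesis's code: the toy data, the function names, and the choice to solve the normal equations directly are all assumptions made for illustration, and the thesis's actual features and optimization procedure are not given in the abstract.

```python
# Minimal sketch: ordinary least squares for multiple linear regression,
# plus R^2 (goodness of fit) and MSE. All data and names are hypothetical
# and are not taken from the thesis.

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: features are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, targets):
    """Return [intercept, coef_1, ..., coef_k] from the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, features):
    """Apply the fitted intercept and coefficients to each feature row."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], f)) for f in features]


def mse(targets, predictions):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def r_squared(targets, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: two regional features per row, one target value.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]]
y = [2.0 + 0.5 * a + 1.5 * b for a, b in X]  # exactly linear by construction

coefs = fit_ols(X, y)
preds = predict(coefs, X)
print("coefficients:", [round(c, 4) for c in coefs])
print("R^2:", round(r_squared(y, preds), 4), "MSE:", round(mse(y, preds), 6))
```

Because the toy target is exactly linear in the two features, the fit recovers the generating coefficients with R^2 near 1 and MSE near 0. On real regional data, values such as the reported 0.8489 and 0.0226 would be computed the same way, ideally on held-out data rather than the training set.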
Keywords/Search Tags:Data Mining, Decision Tree, Random Forest, Malignant tumor, Multiple linear regression