Data Mining Applications And Research In The Classification Of Hypothyroidism

Posted on:2011-12-05

Degree:Master

Type:Thesis

Country:China

Candidate:T Long

Full Text:PDF

GTID:2204360305959485

Subject:Computer application technology

Abstract/Summary:

As medical informatics develops and the volume of diagnostic data grows, data mining is needed to extract latent, significant knowledge from that data. Existing research on hypothyroidism classification is not sufficient to determine the strengths and weaknesses of classification models, because it approaches the problem from medical analysis, from statistical theory, or from a single data mining model. It does not combine statistical methods with data mining, and it does not compare multiple data mining models comprehensively. To address this gap, this thesis studies different hypothyroidism data from the standpoint of both statistical methods and practical application, and compares different classification models. It analyzes the performance of three models in terms of variable requirements, data robustness, time consumption, interpretability of results, classification accuracy, scalability, and other factors, and offers reference and guidance for the clinical diagnosis of hypothyroidism. The thesis covers the following aspects:

1) It introduces the concepts and major applications of data mining, and analyzes in depth each stage of the CRISP-DM data mining process and the corresponding results. It develops a deeper business understanding of hypothyroidism classification by combining research with application. In the data understanding stage, it explores the attributes of the hypothyroidism data in depth so that the training and testing sets are more general and representative. In the data preparation stage, it analyzes and pre-processes fields affected by missing values, outliers, and useless or redundant attributes.

2) It studies the main methods, mathematical principles, and applications of discriminant analysis, logistic regression, and the CHAID decision tree, and explores the similarities, differences, and relationships between statistical methods and data mining on the basis of statistical theory and data models. It determines the main indicators for hypothyroidism classification by building appropriate data mining models. It analyzes the data mining algorithms statistically and studies the models to find the best combination of statistical methods and data mining principles, then measures and compares the three models across the different factors.

3) It applies the CRISP-DM standard process to a systematic hypothyroidism study in the Clementine 12.0 development environment, covering the six stages of the process both as a whole and in detail, so that the data mining work is structured, systematic, standardized, and visual. It uses the scripting language to automate the whole data mining process, which reduces manual, repetitive, time-consuming tasks and enables automated and batch processing in the user interface.
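
As a rough illustration of the idea behind the CHAID decision tree named above, the sketch below shows how a chi-square test can rank candidate predictors for a split. It is not the thesis's Clementine implementation: the field names (`high_tsh`, `female`, `hypothyroid`) and the 40 records are invented for the example, only binary fields are handled, and the category merging and Bonferroni adjustment of full CHAID are left out.

```python
import math
from collections import Counter


def chi_square(xs, ys):
    """Pearson chi-square statistic for the contingency table of xs against ys."""
    n = len(xs)
    cells = Counter(zip(xs, ys))   # observed count per (x, y) cell
    rows = Counter(xs)             # marginal counts of the predictor
    cols = Counter(ys)             # marginal counts of the target
    stat = 0.0
    for x in rows:
        for y in cols:
            expected = rows[x] * cols[y] / n
            stat += (cells[(x, y)] - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Valid only for 2x2 tables, which is all this sketch supports.
    """
    return math.erfc(math.sqrt(stat / 2.0))


def best_split(records, target, predictors):
    """Return (p_value, predictor, statistic) for the most significant predictor."""
    ys = [r[target] for r in records]
    scored = []
    for name in predictors:
        xs = [r[name] for r in records]
        stat = chi_square(xs, ys)
        scored.append((p_value_df1(stat), name, stat))
    return min(scored)  # smallest p-value wins


def make_records():
    """Build 40 invented records; the counts are illustrative, not thesis data."""
    counts = {
        # (high_tsh, female, hypothyroid): number of records
        (1, 1, 1): 8, (1, 0, 1): 7, (1, 1, 0): 2, (1, 0, 0): 3,
        (0, 1, 1): 2, (0, 0, 1): 3, (0, 1, 0): 8, (0, 0, 0): 7,
    }
    records = []
    for (tsh, female, label), k in counts.items():
        records += [{"high_tsh": tsh, "female": female, "hypothyroid": label}] * k
    return records


records = make_records()
p, name, stat = best_split(records, "hypothyroid", ["high_tsh", "female"])
print(f"split on {name}: chi2={stat:.2f}, p={p:.4f}")
```

On this invented data the hypothetical `high_tsh` field is chosen because its association with the label is significant, while `female` shows no association at all. A full CHAID implementation would repeat the test within each resulting branch and stop when no predictor passes the significance threshold.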

Keywords/Search Tags:

hypothyroidism, data mining, CRISP-DM, discriminant analysis, Logistic regression, CHAID tree

Related items

1	Study On The Strategy Of Classification Methods In Data Mining And Their Applications In Biomedicine
2	The Study Of Subhealth Status And Cause Investigation Of College Students Based On Logistic Regression And Decision Tree Model
3	Analysis Of Influencing Factors Of Hypoxemia During One-lung Ventilation Based On Logistic Regression Model And CHAID Decision Tree Model
4	Multivariate analysis techniques: Multiple discriminant analysis and multiple logistic regression. A study in the theoretical background and application of multiple discriminant analysis and multiple logistic regressio
5	The Application Of Combining Decision Tree With Logistic Regression To Analyze The Effect Of Practicing The New Type Rural Cooperative Medical System(CMS)
6	Type 2 Diabetes, Discriminant Analysis And Logistic Regression Analysis
7	Clinical Study Of Conditional Logistic Regression And Data Mining On Risk Factors Of Nonsyndromic Cleft Lip And Palate
8	The Establishment And Validation Of The Discriminant Analysis Model For Bone Metastases In Newly Diagnosed Prostate Cancer Patients
9	Data Mining-based Subhealth Analysis Of Chinese Software Programmers
10	A logistic regression and discriminant function analysis of enrollment characteristics of student veterans with and without disabilities