The Application And Research Of Improved Clustering Algorithm In Tibetan Medical Diagnosis And Treatment Data

Posted on:2020-10-10

Degree:Master

Type:Thesis

Country:China

Candidate:C Y Liu

GTID:2404330596984451

Subject:Computer technology

Abstract/Summary:

The main purpose of a clustering algorithm in data mining is to divide data into groups so that records with similar characteristics fall together. Clustering algorithms are used in many fields, such as commerce, agriculture, networking, and medicine, and play an important role in data mining. As a result, more and more clustering algorithms have appeared, and cluster analysis has become an active research field. This thesis applied an improved clustering algorithm to data on chronic atrophic gastritis (CAG) in Tibetan medicine, then classified and analyzed syndrome types based on the clustering results.

First, three clustering algorithms commonly used for syndrome classification in medicine were applied to clinical diagnosis and treatment data of CAG. Based on the experimental comparison and the evaluation function, k-means performed best and was chosen as the algorithm to improve. It was then improved by using cosine similarity from the vector space model. Measured by the sum of cosine values between the data in each cluster and the corresponding cluster center, the improved algorithm was more effective.

Second, with regard to cluster initialization, the proportion selection method was used to choose the initial cluster centers. Combined with the cosine-based k-means from the first step, this gave more accurate clustering results, again judged by the sum of cosine values. In simulation experiments, the improved k-means algorithm combining proportion selection and cosine similarity achieved the highest accuracy and effectiveness, and therefore the best clustering results.

Finally, the original and the improved k-means algorithms were each used to analyze the CAG data. The experimental results showed differences between the initial and final cluster centers. The sum of cosine values and the number of iterations showed clearly that the original k-means algorithm was unsatisfactory. The comparative analysis of the experiments led to the conclusion that k-means improved with cosine similarity and proportion selection performed best, with high similarity between the data and the center within each cluster. Based on these experimental results, the syndrome types were obtained and the characteristics of each syndrome type were analyzed.
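The approach described in the abstract can be illustrated with a minimal sketch. The abstract does not define the proportion selection rule or the exact update step, so everything below is an assumption made for illustration: `proportional_init` uses a roulette-wheel rule that favors points dissimilar to the centers already chosen, and `cosine_kmeans` assigns each point to the center with the highest cosine similarity, then reports the sum of cosine values and the iteration count. All function and parameter names are hypothetical and not taken from the thesis.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def proportional_init(data, k, rng):
    """Choose k initial centers.

    ASSUMPTION: a roulette-wheel rule stands in for the thesis's
    "proportion selection method", whose definition the abstract omits.
    """
    centers = [list(rng.choice(data))]
    while len(centers) < k:
        # Weight each point by its dissimilarity to the most similar chosen center.
        weights = [max(1.0 - max(cosine(p, c) for c in centers), 0.0) for p in data]
        if sum(weights) == 0.0:
            # Every point is parallel to an existing center: fall back to uniform choice.
            centers.append(list(rng.choice(data)))
        else:
            centers.append(list(rng.choices(data, weights=weights, k=1)[0]))
    return centers


def cosine_kmeans(data, k, max_iter=100, seed=0):
    """k-means with cosine-similarity assignment (max_iter must be at least 1).

    Returns (labels, centers, sum_of_cosine_values, iterations_used).
    """
    rng = random.Random(seed)
    centers = proportional_init(data, k, rng)
    labels = None
    for iteration in range(1, max_iter + 1):
        # Assign every point to the center it is most similar to.
        new_labels = [max(range(k), key=lambda j: cosine(p, centers[j])) for p in data]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Move each center to the mean of its members; empty clusters keep their center.
        for j in range(k):
            members = [p for p, label in zip(data, labels) if label == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    # Evaluation value: sum of cosine similarity between each point and its center.
    score = sum(cosine(p, centers[label]) for p, label in zip(data, labels))
    return labels, centers, score, iteration
```

Calling `cosine_kmeans(data, k)` on a list of numeric feature vectors returns the cluster labels together with the sum of cosine values and the iteration count, the two quantities the abstract uses to compare the original and improved algorithms. A higher sum means the data in each cluster point in a direction closer to their center.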

Keywords/Search Tags:

Data mining, Cluster analysis, Improved k-means algorithm, Proportion selection method, Cosine similarity

Related items

1	The Application Of Data Mining Technology In Medical Information System
2	The Mining Method Of Clinical Thinking And Regularities Of Prescription Use Of Famous Doctors Through Clustering And Association Rule
3	Application Of Cluster Analysis Algorithm In Thalassemia Disease
4	Identify Genes Associated With Alzheimer’s Disease Based On The Angle Cosine Distance
5	The Research Using Data Mining Technology On Spectral Rules Of Points Selection And Compatibility About Toothache Based On Periodical Literature
6	Study On The Risk Prediction Model Of Elderly Patients Before Operation Based On Cluster Learning
7	Research On Data Mining Method Of Diabetes Risk Based On Electronic Medical Record Analysis
8	Research On The Relationship Between Genes And Diseases Based On Data Mining
9	Research On The Technology Of Decision Support Of Tibetan Medicine Treatment Of The Common Diseases Of The Plateau Based On Medical Data Mining
10	Study On The Traditional Chinese Medical Syndrome Element Of Early Diabetic Microangiopathy Based On Data Mining Method