Research And Application Of User Health Classification Method

Posted on: 2019-10-10
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Zhang
Full Text: PDF
GTID: 2428330545988409
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, computers and information technology have reached every corner of social life. Health informatization builds on this growth to move health information services from offline community clinics and hospitals onto the Internet. People are eager to understand their own health through simple devices and simple procedures, so that some diseases can be prevented. In current practical applications of health informatization, user health is usually classified by statistical methods. However, some statistical methods require not only the user's personal health history and family history, but also the proportion of people in the same sex and age group who are exposed to a given risk factor, which takes a considerable amount of work to obtain.

This thesis proposes two classification methods: a classification method based on the Harvard Cancer Risk Index and an Apriori-weighted naive Bayes classification method (the APNBC classification method). The method based on the Harvard Cancer Risk Index model is a statistical classification method. It analyzes the health data supplied by users against medical reference thresholds, then applies the Harvard Cancer Risk Index formula to obtain the user's health risk status. To a certain extent, this method can quantify the degree of disease risk. However, this thesis finds that it requires not only detailed user health data but also the prevalence of each risk factor within the user's sex and age group. The thesis therefore turns to naive Bayes, a simple and very practical classification algorithm from machine learning. An important precondition of naive Bayes is that the attributes are mutually independent, but in practice attributes are rarely unrelated. To reduce the algorithm's reliance on this independence assumption, this thesis proposes the APNBC classification method, which uses the Apriori algorithm from association rule mining to weaken the influence of relationships among condition attributes on the classification, reduce the dependence between attributes, and improve classification accuracy.

The main work of the thesis is as follows:

1) A classification method based on the Harvard Cancer Risk Index model is proposed. The method is designed around the reference ranges that medicine provides for each attribute. Health data that users upload regularly are evaluated against these reference values, and the comparison yields a 2-tuple describing the user's health status. Finally, the Harvard Cancer Risk Index model is used to calculate the risk of disease.

2) To handle missing data during preprocessing, an improved k-means method (the LKM method) is proposed. The improved method combines the advantages of hierarchical clustering and the k-means algorithm and overcomes shortcomings of the original algorithm. Its basic idea is to run hierarchical clustering first to obtain the number of clusters k and the initial cluster centers, then refine the result with k-means. Comparative experiments show that the improved method produces high-quality clustering results.

3) After analyzing the advantages and defects of the naive Bayes classification method, this thesis puts forward the APNBC classification method. The method uses the Apriori algorithm from association rule mining to find the key attributes and eliminate the health attributes associated with them, then computes a weight for each key attribute and applies it, which reduces the dependence between attributes and gives the weighted APNBC classification formula. To verify the usability of the APNBC classification method, it is compared experimentally, on the Pima Indians medical data set, with the proposed classification method based on the Harvard Cancer Risk Index model. The experimental results show that the APNBC classification method is effective and usable.

4) Because practical applications are currently lacking, this thesis applies the proposed classification methods in a user health classification prototype system built on Android.

In summary, the thesis first studies user health classification in the Internet environment in theoretical depth. A classification method based on the Harvard Cancer Risk Index model is proposed to classify user health, and its advantages and disadvantages are analyzed. Second, the LKM method is proposed as an optimization of the k-means algorithm. Six groups of comparative experiments between k-means and LKM are carried out on six UCI data sets, and they show that LKM finds the target clustering in a shorter time. Third, the thesis proposes the APNBC classification method, which reduces the dependency between attributes through the Apriori algorithm. To demonstrate its effectiveness, APNBC is compared with the traditional naive Bayes classifier (NBC) and the hidden naive Bayes algorithm on the Pima Indians data set. APNBC improves classification accuracy by about 5% over NBC and by about 3% over hidden naive Bayes, which indicates that the method is effective. Finally, the user health classification methods are used to design the user health classification prototype system, and an empirical study is carried out.
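
The two-stage idea behind the LKM method (hierarchical clustering to obtain k and the initial centers, then k-means refinement) can be illustrated with a minimal Python sketch. The abstract does not specify the linkage criterion, the stopping rule, or how the clusters are used to fill in missing values, so everything concrete here is an assumption: centroid linkage, a hypothetical `merge_threshold` stopping distance, and the function names themselves.

```python
import math


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def hierarchical_seed(points, merge_threshold):
    """Agglomerative pass (assumed centroid linkage): repeatedly merge the two
    closest clusters until the closest pair is farther apart than
    merge_threshold. Returns the surviving centroids; their count is k."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > merge_threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [centroid(c) for c in clusters]


def kmeans_refine(points, centers, max_iter=100):
    """Standard Lloyd iterations starting from the given centers."""
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: recompute each center; keep the old one if it is empty.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(centroid(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def lkm(points, merge_threshold):
    """Hierarchical seeding followed by k-means refinement."""
    seeds = hierarchical_seed(points, merge_threshold)
    return kmeans_refine(points, seeds)
```

For example, `lkm([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], merge_threshold=3.0)` separates the two visibly distinct groups without k being supplied in advance. The pairwise search makes the seeding step expensive on large inputs, so this sketch is suited to illustration rather than production use.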
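
The abstract does not give the weighted APNBC formula. As a hedged illustration of attribute weighting in general, the Python sketch below uses one common form of weighted naive Bayes, in which each attribute's log-likelihood is multiplied by a weight: score(c) = log P(c) + Σ w_i · log P(x_i | c). The weights are passed in directly as a hypothetical `weights` list; in the thesis they would come from the Apriori-based analysis of key attributes, which is not reproduced here. The function names and the Laplace smoothing are likewise assumptions.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Collect the counts needed for a categorical naive Bayes model."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attr_index, class) -> value counts
    domains = defaultdict(set)           # attr_index -> set of observed values
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
            domains[i].add(v)
    return class_counts, value_counts, domains


def predict_weighted_nb(model, row, weights):
    """Return the class maximizing log P(c) + sum_i weights[i] * log P(x_i | c).

    weights is a hypothetical per-attribute weight list; with every weight
    equal to 1.0 this reduces to ordinary naive Bayes."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)
        for i, v in enumerate(row):
            # Laplace-smoothed estimate of P(x_i = v | c).
            p = (value_counts[(i, c)][v] + 1) / (n_c + len(domains[i]))
            score += weights[i] * math.log(p)
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

A weight below 1.0 shrinks the contribution of an attribute that is strongly associated with another one, which is one way to soften the independence assumption; a weight of 0.0 removes the attribute entirely.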
Keywords/Search Tags: Harvard cancer risk index model, k-means, association rules, Apriori, Naive Bayes