Application of Decision Trees and Regression to the Factors Influencing Health Service Use

Posted on: 2012-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: H X Liu
GTID: 2154330335986637
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: To use decision tree and logistic regression techniques to analyze the factors influencing whether urban and rural residents see a doctor and whether they are hospitalized; to study the effect of applying decision trees and logistic regression in combination to residents' use of medical services; and to identify the main factors behind medical treatment among urban and rural residents. Because the influencing factors differ between regions, health policies should differ accordingly so as to meet the health service needs of more residents. The aim is to improve the efficiency and equity of health services and to provide a reference for health service decisions.

Methods: Survey data from the western Chongqing expansion area of the Fourth National Health Services Survey were analyzed with SAS 9.1 and SPSS 17.0. According to the types of variables and data, CHAID, CART and binary unconditional logistic regression were chosen for the analysis. Decision tree and logistic regression models of seeing a doctor or not, and of being hospitalized or not, were then built separately for urban and rural residents. The performance of the decision tree and logistic regression models, and the variables each one selected, were compared.

Results: 1. Seeing a doctor and its multivariate analysis: the two-week rate of not seeing a doctor was 53.07% (urban 51.79%, rural 54.48%); it was 54.66% among men and 52.58% among women. By chi-square test, there was no difference between urban and rural areas, and no difference between women and men. However, age, occupation type, employment status, education level, marital status, medical insurance, chronic disease status, minimum distance to medical care and self-perceived illness all affected the two-week treatment rate (P<0.0001). In the multivariate analysis of seeing a doctor or not, the CART tree was the best model for urban areas. The CART tree has 5 layers and 6 leaf nodes, corresponding to 6 classification rules. Its misclassification rate is 0.198. Per-capita income, minimum distance to medical care, chronic disease status, education level, age, occupation type, marital status and medical insurance were selected into the model. In rural areas, the CHAID tree was the best model. The CHAID tree has 3 layers and 11 leaf nodes, corresponding to 11 classification rules. Its misclassification rate is 0.211. Per-capita income, minimum distance to medical care and self-perceived illness were selected into the model.

2. Hospitalization and its multivariate analysis: the non-hospitalization rate in this area was 36.42%. By chi-square test, the rate differed by per-capita income (P=0.0365), education level (P=0.0341) and medical insurance (P=0.0047). The lowest per-capita income group had a non-hospitalization rate of 41.18%, the highest of all income levels. Among education levels, "illiterate" patients had the highest non-hospitalization rate, 41.83%. Patients covered by "other social medical insurance" had a non-hospitalization rate of 75%, the highest of all insurance types, while patients with publicly funded medical care had the lowest, 14.29%. In the multivariate analysis of hospitalization or not, the logistic regression model was the best model for urban areas. Its misclassification rate is 0.220, and education level was the variable entered into the model. For patients in rural areas, the CHAID tree was the best model. The CHAID tree has 1 layer and 2 leaf nodes, corresponding to 2 classification rules. Its misclassification rate is 0.283, and per-capita income was the variable entered into the model.

Conclusion: 1. More than half of the patients in this survey did not seek medical care within two weeks, so the rate of not seeing a doctor is high. Age, occupation type, employment status, education level, marital status, medical insurance, chronic disease status, minimum distance to medical care and self-perceived illness influence whether a patient sees a doctor. Residence type (urban or rural) and gender have no influence. The multivariate analysis showed that, in urban areas, per-capita income, minimum distance to medical care, chronic disease status, education level, age, occupation type, marital status and medical insurance are the main factors in whether patients see a doctor. In rural areas, the main factors were per-capita income, minimum distance to medical care, self-perceived illness, medical insurance and chronic disease status.

2. The non-hospitalization rate was 36.42%, which is relatively high. Per-capita income, education level and medical insurance influence hospitalization. The multivariate analysis showed that education level was the main factor in hospitalization for urban patients: the higher the education level, the lower the non-hospitalization rate. Per-capita income was the main factor in hospitalization for rural patients. The probability of hospitalization was 51.6% for "low-income" patients but 65.4% for "middle-income and high-income" patients, much higher than for the "low-income" group.

3. The model comparison showed that decision trees performed better than the logistic regression model for seeing a doctor or not, in both urban and rural areas. In the analysis of hospitalization or not, the logistic regression model performed better than the decision tree for urban patients, whereas in rural areas the decision tree performed better than the logistic regression model.
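
As an illustration of the kind of one-split tree reported for rural hospitalization (1 layer, 2 leaf nodes, split on per-capita income) and of the misclassification rate used to compare models, here is a minimal Python sketch. It is not the thesis's SAS/SPSS analysis: the data are invented toy values, the split criterion is CART-style Gini impurity rather than CHAID's chi-square test, and all function and variable names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(values, labels):
    """Return (threshold, weighted_gini) for the best `value <= threshold` split."""
    best_threshold, best_score = None, float("inf")
    # The largest value is skipped: splitting there leaves the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]


def misclassification_rate(y_true, y_pred):
    """Share of cases whose predicted class differs from the observed class."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


# Invented toy data (NOT from the survey).
# income level: 1 = low, 2 = middle, 3 = high; hospitalized: 1 = yes, 0 = no.
income = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
hospitalized = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]

threshold, score = best_binary_split(income, hospitalized)

# Each of the two leaves predicts its own majority class.
left_class = majority([y for x, y in zip(income, hospitalized) if x <= threshold])
right_class = majority([y for x, y in zip(income, hospitalized) if x > threshold])
stump_pred = [left_class if x <= threshold else right_class for x in income]

# Baseline with no split: always predict the overall majority class.
baseline_pred = [majority(hospitalized)] * len(hospitalized)

stump_rate = misclassification_rate(hospitalized, stump_pred)
baseline_rate = misclassification_rate(hospitalized, baseline_pred)
print(threshold, stump_rate, baseline_rate)
```

On this toy data the split falls between the low-income group and the rest, which mirrors the two-leaf structure described above. A logistic regression model could be scored with the same `misclassification_rate` function by thresholding its predicted probabilities, and the model with the lower rate preferred.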
Keywords/Search Tags: decision tree, logistic regression, seeing a doctor, hospitalization, influencing factors