Objective: Stroke is one of the most common chronic diseases in the world and a leading cause of death and disability among adults in China. It places a significant economic burden on society and families. The aim of this study is therefore to analyse the characteristics of hospitalization and the distribution of hospitalization costs of stroke patients in a healthcare facility, investigate the factors that most influence those costs, identify effective analysis methods, and provide empirical evidence for the prevention and treatment of stroke and the reduction of its disease burden. Determining the epidemiological distribution of stroke is of great practical significance because it provides an important basis for empirical analysis of stroke prevention and treatment in the region.

Methods: Based on ICD codes I60-I64, hospitalization cost records of stroke patients from 2017 to 2019 were extracted from the information system of a tertiary hospital in Yuncheng City, Shanxi Province. Descriptive statistics were used to analyse the socio-demographic information of stroke patients and the general characteristics of hospitalization costs. Structural variation analysis was used to analyse the average hospitalization cost of stroke patients and the variability of its components by calculating the value of structure variation, the degree of structural variation and the share (contribution rate) of structural variation. Three commonly used machine learning algorithms (random forest, support vector machine and logistic regression) were used to analyse the factors influencing the hospitalization costs of stroke patients, and the factors selected by the different models were ranked in order of importance to identify the most important ones.

Results: 1. Basic information about stroke patients’ hospitalization costs: During 2017-2019, the overall trend in hospitalization costs for stroke patients was downward, with an average (median) hospitalization cost of 23,716.89 yuan. However, the proportion
of drug cost was still too high, at around 38% in all three years, and the proportion of sanitary material cost was the second highest, at around 13% in all three years. The inspection charge changed the most from 2017 to 2019: its value of structure variation was 4.51% and its structural variation contribution rate was 34.09%. Treatment cost, drug cost and sanitary material cost ranked second to fourth, contributing 16.02%, 13.76% and 11.94%, respectively, to the structural change in total hospitalization cost. The combined contribution of these four cost items was 75.81%. 2. Analysis of factors influencing the cost of hospitalization for stroke patients: The top five influencing factors from logistic regression were number of days in hospital, departmental consistency, stroke classification, whether operated on and department. The top five from the random forest were number of days in hospital, stroke classification, department, whether operated on and number of hospitalizations. The top five from the support vector machine were number of days in hospital, stroke classification, department, whether operated on and gender. The prediction accuracies of the support vector machine and random forest models were 89.26% and 87.52%, respectively.

Conclusion: 1. After the full implementation of the zero-markup drug policy, drug and care costs for stroke patients at this medical facility showed negative structural variation, while all other cost items, although decreasing in absolute value, increased as a proportion of total costs. This shows that the reform has had an impact on hospitalization costs, but the proportions of drug and sanitary material costs remain high. Managers should therefore focus on the fee structure, improve the compensation mechanism, increase the proportion of fees for technical labour and guarantee a reasonable income for medical staff, so that the contribution of medical staff to medical practice is truly reflected. 2. Three different machine learning algorithms were
selected to analyse the factors affecting the hospitalization costs of stroke patients. The main factors affecting costs were the number of days in hospital, stroke classification, whether operated on and the department. This suggests that the authorities should carry out health education for key populations, deepen and improve the tertiary prevention of chronic diseases such as stroke, and strengthen cooperation between medical associations. Such comprehensive measures can effectively reduce the number of ineffective hospital days, make efficient use of hospital resources and meet patients’ increasing demand for medical services, while reducing the financial burden of the disease on patients and their families. 3. Traditional multiple linear regression is not well suited to the multidimensional nature, large volume, complexity and incompleteness of healthcare data. In this study, we transformed the data and used three machine learning algorithms: random forest, logistic regression and support vector machine. The results showed that these methods are more suitable and feasible. No model is inherently superior or inferior; the models differ only in the conditions under which they apply. It is recommended that, before analysing the data, the characteristics of the data be taken into account and the most suitable model be selected by comparing multiple models.
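
The structural variation indicators named in the Methods can be illustrated with a minimal sketch. The abstract does not state the formulas, so the definitions below are an assumption based on how these indicators are commonly defined in hospital cost studies: the value of structure variation (VSV) of a cost item is its end-period share minus its base-period share, the degree of structural variation (DSV) is the sum of the absolute VSVs, and an item's contribution rate is its absolute VSV divided by the DSV. The function name and all cost shares are hypothetical and are not the study's data.

```python
def structural_variation(base_shares, end_shares):
    """Compute assumed structural variation indicators for cost shares.

    base_shares, end_shares: dicts mapping cost item -> share of total
    hospitalization cost in percent, for the first and last period.
    Returns (vsv, dsv, contribution), where contribution is in percent.
    """
    # VSV: change in each item's share, in percentage points
    vsv = {item: end_shares[item] - base_shares[item] for item in base_shares}
    # DSV: total absolute movement of the cost structure
    dsv = sum(abs(v) for v in vsv.values())
    # Contribution rate: each item's part of the total movement
    contribution = {
        item: (abs(v) * 100.0 / dsv if dsv else 0.0) for item, v in vsv.items()
    }
    return vsv, dsv, contribution


# Hypothetical shares (%) for illustration only; not taken from the study.
shares_start = {"drug": 40.0, "inspection": 10.0, "treatment": 20.0,
                "material": 14.0, "other": 16.0}
shares_end = {"drug": 38.0, "inspection": 14.0, "treatment": 18.0,
              "material": 13.0, "other": 17.0}

vsv, dsv, contribution = structural_variation(shares_start, shares_end)

# List items from largest to smallest contribution to the structural change.
for item in sorted(contribution, key=contribution.get, reverse=True):
    print(f"{item}: VSV={vsv[item]:+.2f} points, contribution={contribution[item]:.2f}%")
```

Under these assumed definitions, a negative VSV means an item's share of total cost fell, and the contribution rates sum to 100%. If the study used the same definitions, the reported inspection-charge figures (VSV 4.51%, contribution rate 34.09%) would imply a DSV of roughly 13.2 percentage points.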