Objective: The incidence of infectious diseases is affected by many factors, which makes accurate prediction difficult. Machine learning methods have developed rapidly in recent years and are now widely used for prediction and analysis. Based on the monthly incidence of brucellosis in Hebei Province from 2004 to 2016, this paper uses meteorological factors to build machine learning models that predict the epidemic situation and development trend of brucellosis, providing a scientific basis for formulating prevention strategies. Three machine learning algorithms were used: neural network, support vector machine, and random forest. The best model was selected by comparing the prediction accuracy of the algorithms, which may broaden the approaches available for infectious disease prediction and offer more options for practical work.

Materials and methods: Monthly brucellosis data for Hebei Province from 2004 to 2016 were collected from the official website of the public health data center. Monthly meteorological data for Hebei Province were obtained from the China Meteorological Data website. Spearman correlation analysis was used to assess the relationship between meteorological factors and the incidence rate of brucellosis. Meteorological factors that were significantly correlated with brucellosis incidence were used as model inputs, and the monthly incidence rate of brucellosis was used as the output. Data from 2004 to 2015 formed the training set, and data from 2016 formed the test set used for validation. The best prediction model was determined by comparing the error between predicted and actual values.

Results: From 2004 to 2016, 43,628 cases of brucellosis were reported in Hebei Province, an average of 280 cases per month and an average incidence rate of 3.898 per million. The incidence rate increased over the 13 years and showed clear periodicity and seasonality. Spearman rank correlation showed that meteorological factors were correlated with the incidence rate of brucellosis. All three machine learning models built on meteorological factors predicted well. The best was the neural network model, with a mean absolute percentage error (MAPE) of 0.178, compared with 0.668 for the ARIMA model. On these results, the machine learning models outperformed the ARIMA model, and the neural network was the best prediction model, which may give it greater value in practical application.

Conclusion: There is a significant correlation between meteorological factors and the incidence of brucellosis. Meteorological factors can be used to build machine learning models that predict the incidence rate of brucellosis and support early warning. Taking preventive measures in advance helps minimize the risk of brucellosis.
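
The evaluation design described in the abstract (train on 2004 to 2015, test on 2016, compare models by MAPE) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `mape` and `split_by_year`, the record layout, and all numbers are hypothetical placeholders, and MAPE is assumed here to be reported as a fraction (so 0.178 would mean 17.8%).

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction (assumed convention)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def split_by_year(records, last_train_year=2015):
    """Chronological split: rows up to last_train_year train, later rows test."""
    train = [r for r in records if r["year"] <= last_train_year]
    test = [r for r in records if r["year"] > last_train_year]
    return train, test


# Synthetic placeholder rows (NOT the study's data): one row per month.
records = [
    {"year": y, "month": m, "rate": 1.0 + 0.1 * (y - 2004) + 0.05 * m}
    for y in range(2004, 2017)
    for m in range(1, 13)
]
train, test = split_by_year(records)

# Toy check of the metric with made-up actual and predicted values.
actual = [4.0, 5.0, 2.0]
predicted = [3.0, 5.5, 2.5]
print(len(train), len(test), round(mape(actual, predicted), 3))  # 144 12 0.2
```

Splitting by year rather than at random keeps the test period strictly after the training period, which matches how a forecast would be used in practice.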