Objective: Based on monthly surveillance data on scarlet fever in China from 2004 to 2018, this study analyzed the epidemic characteristics of scarlet fever nationwide and built a traditional SARIMA model and an LSTM model for prediction. Data from 2004 to 2017 served as the training set and data from 2018 as the test set. The fitting and prediction performance of the models was compared to identify the better prediction model, which can provide a decision-making basis for the prevention and control of scarlet fever.

Methods: Scarlet fever data were collected from the Chinese Disease Prevention and Control Information System, and the number of cases reported in China from January 2004 to December 2018 was collated. Excel was used for statistical and descriptive analysis, and dynamic series and seasonal analyses were used to study the epidemic characteristics of scarlet fever in China. The SARIMA and LSTM models were built in Python and used to predict the following year's data. Mean absolute error (MAE), mean square error (MSE), mean absolute percentage error (MAPE) and root mean square error (RMSE) were used to evaluate the prediction performance of the models.

Results:
1. From January 2004 to December 2018, a total of 655,396 cases of scarlet fever were reported in China, with an average annual reported incidence of 3.2344/100,000. The lowest incidence was 1.4570/100,000 in 2011, and the highest was 5.6774/100,000 in 2018.
2. From 2004 to 2018, scarlet fever incidence showed a rising trend with an obvious seasonal "double peak" distribution. The first peak appeared in May and June, accounting for 27.79% of all cases; the second peak appeared in November and December, accounting for 24.90% of all cases. The epidemic intensity of the second peak was weaker than that of the first.
3. The top six regions in terms of annual reported incidence were Beijing (14.40/100,000), Ningxia Hui Autonomous
Region (10.66/100,000), Liaoning Province (10.05/100,000), Shanghai (9.11/100,000), Tianjin (8.73/100,000) and Inner Mongolia Autonomous Region (8.69/100,000).
4. The monthly incidence of scarlet fever from January 2004 to December 2017 was used to build the SARIMA model. Based on the minimum AIC and BIC criteria and the residual white noise test, the ARIMA(2,1,2)(0,1,1)12 model was selected as the optimal model for predicting the incidence trend of scarlet fever in China. The predicted monthly incidence rates for January to December 2018 were 0.5400, 0.2580, 0.3499, 0.5006, 0.8059, 0.8287, 0.4967, 0.2844, 0.2967, 0.4036, 0.6329 and 0.7899 per 100,000, respectively.
5. An LSTM neural network model was built. The incidence from January 2004 to December 2016 was used as the training set, and the incidence from January 2017 to December 2017 as the internal validation set. The LSTM model with a step size of 12, 1 hidden layer and 5 nodes was selected to predict the incidence in 2018. The predicted monthly incidence rates for January to December 2018 were 0.4507, 0.1883, 0.3026, 0.4245, 0.7255, 0.6256, 0.2405, 0.0769, 0.1774, 0.2433, 0.5491 and 0.8069 per 100,000, respectively.
6. The SARIMA and LSTM models were both built on the monthly incidence of scarlet fever in China from 2004 to 2017 and used to predict the incidence in 2018. In terms of modeling performance, the LSTM model reduced MSE by 0.16%, MAE by 0.89%, RMSE by 1.73% and MAPE by 16.31% relative to the SARIMA model. In terms of prediction performance, it reduced MSE by 0.46%, MAE by 2.67%, RMSE by 3.09% and MAPE by 18.65%.

Conclusion: From January 2004 to December 2018, a
total of 655,396 cases of scarlet fever were reported in China, with an average annual reported incidence of 3.2344 per 100,000. From 2004 to 2018, scarlet fever incidence showed a rising trend with an obvious seasonal "double peak" distribution. The first peak appeared in May and June and the second in November and December, and the epidemic intensity of the second peak was weaker than that of the first. From 2004 to 2018, the top six regions by reported incidence were Beijing, Ningxia Hui Autonomous Region, Liaoning Province, Shanghai, Tianjin and Inner Mongolia Autonomous Region. Both the traditional SARIMA model and the LSTM neural network model fit the epidemic trend of scarlet fever in China from January 2004 to December 2018 well. As measured by MSE, MAE, RMSE and MAPE, the LSTM model outperformed the SARIMA model in both modeling and prediction performance.
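
The four evaluation metrics named in the Methods, together with a percentage reduction between two models of the kind reported in the Results, can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names are hypothetical, the sample values are invented for demonstration and are not the study's data, and `relative_reduction` is only one plausible reading of "decreased by", since the abstract does not state how the reductions were computed.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` error is lower than `baseline` error (assumed definition)."""
    return 100 * (baseline - candidate) / baseline


# Hypothetical monthly incidence values (per 100,000) for illustration only.
actual = [0.50, 0.25, 0.40, 0.80]
model_a = [0.60, 0.30, 0.35, 0.70]  # stand-in for one model's predictions
model_b = [0.55, 0.25, 0.38, 0.75]  # stand-in for the other model's predictions

for name, fn in [("MAE", mae), ("MSE", mse), ("RMSE", rmse), ("MAPE", mape)]:
    err_a, err_b = fn(actual, model_a), fn(actual, model_b)
    print(f"{name}: A={err_a:.4f} B={err_b:.4f} "
          f"reduction={relative_reduction(err_a, err_b):.2f}%")
```

Because MAPE divides by the observed value, it weights errors in low-incidence months more heavily than MAE or RMSE do, which is one reason the four metrics can rank two models by different margins.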