1. School of Public Health, Nanjing Medical University; 2. Jiangsu Provincial Center for Disease Prevention and Control
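Objective: To explore the application of the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the Long Short-Term Memory (LSTM) neural network model in forecasting scarlet fever incidence trends in Jiangsu Province, and to provide a theoretical basis for epidemic prevention and control. Methods: Monthly scarlet fever incidence data and meteorological data for Jiangsu Province from 2005 to 2017 were used as the training set to fit the SARIMA and LSTM models; the corresponding monthly data from 2018 to 2019 served as the test set for evaluating predictive accuracy. Results: Scarlet fever in Jiangsu Province showed marked seasonality, with high-incidence periods each year from April to June and from November to the following January. The optimal SARIMA model was SARIMA(3,1,2)(1,1,1)12. The optimal LSTM took the incidence data of the previous 3 months combined with meteorological common factors as input and the number of cases in the current period as the expected output; it consisted of 4 LSTM layers, each containing 32 long short-term memory units, followed by 1 fully connected layer. On the test set, the mean absolute percentage errors of the two models were 35.97% and 16.94%, and the root mean squared errors were 227.85 and 152.46, respectively. These results suggest that the LSTM neural network outperformed the SARIMA model in both goodness of fit and prospective prediction accuracy. Discussion: The LSTM model fitted and predicted scarlet fever incidence trends in Jiangsu Province well. It can be used for epidemic trend assessment and risk evaluation, and can inform the optimization and adjustment of scarlet fever surveillance, prevention and control strategies and measures.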
Objective: To compare the performance of the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the Long Short-Term Memory (LSTM) model in predicting scarlet fever epidemics in Jiangsu Province, China, and thereby provide a basis for decisions on the prevention and control of scarlet fever. Methods: The SARIMA and LSTM models were fitted with monthly scarlet fever case counts and meteorological variables for Jiangsu Province, China, from 2005 to 2017. Model performance was evaluated with data from 2018 to 2019. Results: Scarlet fever epidemics in Jiangsu Province showed clear seasonality, with higher incidence from April to June and from November to the following January. SARIMA(3,1,2)(1,1,1)12 performed best among the candidate SARIMA models. The optimal LSTM model had 4 LSTM layers, each containing 32 memory cells, and 1 fully connected layer. On the test set, the Mean Absolute Percentage Error (MAPE) of the SARIMA and LSTM models was 22.47% and 16.94%, respectively, and the Root Mean Squared Error (RMSE) was 227.85 and 152.46, respectively. Conclusion: The LSTM model performed well in predicting the incidence of scarlet fever in Jiangsu Province. It can be used to investigate prevalence trends and assess the epidemic risk of scarlet fever, providing a basis for optimizing and adjusting the monitoring, prevention and control strategies and measures for this disease.
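The two accuracy measures reported above, and the way an input window of previous months can be paired with the current month's count, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `mape`, `rmse` and `make_lagged_samples` are hypothetical, the numbers are invented for demonstration, and the meteorological common factors used in the study are left out.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the case counts."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def make_lagged_samples(series, n_lags=3):
    """Pair each monthly value with the n_lags values that precede it.

    Hypothetical helper: it mirrors the idea of using the previous
    3 months as model input and the current month as the target.
    """
    inputs = [series[i - n_lags:i] for i in range(n_lags, len(series))]
    targets = series[n_lags:]
    return inputs, targets


if __name__ == "__main__":
    # Invented monthly counts, for illustration only (not the study's data).
    observed = [100, 200, 400, 500]
    forecast = [110, 180, 400, 450]
    print("MAPE (%):", round(mape(observed, forecast), 2))
    print("RMSE:", round(rmse(observed, forecast), 2))

    inputs, targets = make_lagged_samples([1, 2, 3, 4, 5], n_lags=3)
    print(inputs, targets)
```

MAPE is scale-free, so it makes months with few and many cases comparable, whereas RMSE is expressed in cases per month and penalizes large misses more heavily; this is why the two measures are usually read together.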