| Background
Since the 1970s, with rapid socio-economic development and the increasing globalization of trade, frequent human activity across geographical and ecological boundaries has damaged the environment to some degree. The emergence, outbreak, and spread of emerging infectious diseases pose a major threat to human health. Exploring the epidemiological characteristics, influencing factors, and transmission risks of emerging infectious diseases, and improving the corresponding prevention and control measures, is therefore of great public health significance. In December 2019, coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first reported in Wuhan, China. The disease quickly spread worldwide and was declared a pandemic. Many descriptive studies of COVID-19 have addressed epidemic trends, clinical symptoms, etiology, immunology, diagnosis, and control measures, but studies of its distribution characteristics are fewer and their conclusions are inconsistent. Few studies have examined the epidemiological characteristics of COVID-19 in areas with different levels of urbanization, and the epidemic patterns of COVID-19 have not been classified. Studies have shown that the risk of COVID-19 outbreaks is higher in densely populated places. Among these, hospitals, because of their special function, can become hubs that promote the spread of the virus, so healthcare workers (HCWs) need more attention. However, most investigations of COVID-19 infection among HCWs were single-center studies, which makes it difficult to give an overall description of the infection characteristics and influencing factors among HCWs. Identifying the factors that affect the spread of the virus is a prerequisite for early warning, prevention, and control of emerging infectious diseases. Previous studies have shown that transportation and meteorological factors can significantly affect the spread of
SARS-CoV-2, but the quantitative effects of different transportation modes, such as road, rail, and air, remain to be studied. Studies of meteorological factors have reached inconsistent conclusions and have not considered the interactions between meteorological factors. Previous studies of COVID-19 transmission risk were mostly temporal, and few were spatial. Furthermore, the research scale was coarse, and early warning techniques with high accuracy, wide prediction range, and strong practicability were lacking. Taking COVID-19 as the outcome variable, this study described the epidemiological characteristics of COVID-19 in areas with different levels of urbanization in China’s mainland, analyzed in depth the characteristics of COVID-19 infection among HCWs, comprehensively explored the transportation and meteorological factors affecting the wide spread and local transmission of COVID-19 and their interactions, and systematically constructed a set of high-resolution prediction models for the spread risk of COVID-19 on multiple spatiotemporal scales. In the new normal phase of COVID-19 prevention and control in China’s mainland, this study is expected to provide a scientific basis and guidance for the formulation and implementation of public health policies and measures.
Objectives
1. To describe the distribution of COVID-19 in areas with different urbanization levels, and to identify the spatiotemporal clusters, the Q-mode clusters of epidemic patterns, and the impact of phased COVID-19 policy measures in China’s mainland.
2. To compare the basic characteristics, the time interval from onset to diagnosis, the spatiotemporal clusters, and the epidemic patterns of COVID-19 between HCW and non-HCW cases in Wuhan, and to explore the factors influencing COVID-19 onset and deterioration among HCWs.
3. To explore the impact of transportation on the wide spread of COVID-19, the impact of meteorological factors on the local transmission of COVID-19, and the interaction between meteorological
factors.
4. To use the influencing factors identified above, together with other socio-economic factors or night light data, to establish an early warning model for the spread risk of COVID-19 in China’s mainland.
Methods
1. Data collection
(1) Data on the COVID-19 outbreak in China’s mainland beginning at the end of 2019: case data from December 8, 2019 to February 27, 2020 were provided by the National Notifiable Infectious Disease Information System, and case data from February 28 to April 14, 2020 came from the official websites of the provincial and municipal health commissions. Data on the Beijing COVID-19 outbreak in June 2020 were obtained from the official website of the Beijing municipal health commission, and data on the Hebei COVID-19 outbreak in January 2021 came from the official website of the Hebei provincial health commission.
(2) Population data: county-level population data were collected from the sixth national census. County-level population density was calculated by dividing the population of each county by its area.
(3) Health care data in Wuhan: the list of designated hospitals for treating COVID-19 in Wuhan was obtained from the official website of the Health Commission of Hubei Province. Data on the total numbers of HCWs, nurses, and hospital beds, and on the classification and type of the main hospitals in Wuhan, were extracted from the Wuhan Health Statistical Yearbook.
(4) Socio-economic data: geographic information data on railways, freeways, and national highways were collected from the National Earth System Science Data Center. Data on airport locations were collected from OurAirports. Geographical positioning data for supermarkets and shopping malls across the country were extracted from the Baidu map picking coordinate system.
(5) Meteorological data: data on daily average temperature, temperature difference, relative humidity, sunshine duration, wind, and cumulative precipitation during the epidemic came from the China Meteorological Data Sharing Service System.
(6) Night light data: night light data
were extracted from the China Remote Sensing Satellite Ground Station, Chinese Academy of Sciences. According to the research objectives, the data were checked, collated, and cleaned to form the modeling database.
2. Statistical analysis
(1) We used Baidu Geocoder to locate the township of residence of each COVID-19 case and assigned it to one of three levels of urbanization according to the standard used in the population data: district, town, or countryside. The attack rate, proportion of severe and critical cases, case fatality rate, and distribution of COVID-19 were compared across provinces and urbanization levels. Spatiotemporal analysis was conducted at the county level. A space-time permutation model was used to identify space-time clusters. The effective reproduction number (Rt) of SARS-CoV-2 in each province was calculated to identify the epidemic pattern of COVID-19, and Q-mode hierarchical clustering of Rt was performed to partition the epidemic patterns.
(2) The basic characteristics of HCWs and non-HCWs in different specific units during the epidemic period in Wuhan were compared, as were their spatiotemporal clustering and transmission dynamics. Multivariate linear regression and backward stepwise multivariate logistic regression were used to explore the factors influencing the incidence and deterioration of HCWs, respectively.
(3) Stepwise logistic regression was used to explore the role of transportation in the wide spread of COVID-19. A generalized additive model (GAM) was developed to estimate the non-linear effects of meteorological factors on the local transmission of COVID-19. Based on the correlations between the independent variables, a "bottom-up" strategy and the Akaike information criterion (AIC) were used for model selection. Penalized spline functions were used to smooth and evaluate the pairwise interactions between meteorological factors, and bivariate response surfaces were constructed to visualize the interactions. A linear mixed-effects model (LMM) was
used for segmented modeling at the nodes (breakpoints) identified by the GAM, in order to quantify the influence of meteorological factors.
(4) A maximum entropy niche model was used to build the prediction model. Multiple socio-economic factors (including transportation factors) and meteorological factors were incorporated for training and for internal and external validation on the COVID-19 epidemic data from China’s mainland beginning at the end of 2019. An empirical study was conducted using data from the Beijing COVID-19 outbreak in June 2020 and the Hebei COVID-19 outbreak in January 2021. We simulated COVID-19 outbreaks in key areas and seasons and produced nationwide transmission risk predictions at a high resolution of 0.1°×0.1°. Night light data were then substituted for the multiple socio-economic factors, and the modeling, validation, empirical study, and simulation were repeated to explore whether night light data could replace multiple socio-economic factors in risk prediction.
The software used in this study included Microsoft Office 2016, ArcGIS 10.2, R 3.6.1, SaTScan v9.6, Maxent 3.3.3k, and Adobe Illustrator CC 2015.
Results
1. From December 8, 2019 to February 27, 2020, a total of 78,831 COVID-19 cases were reported nationwide, corresponding to an attack rate of 59.2 per million people; the attack rate ranked from high to low as district, town, and countryside. The overall national proportion of severe and critical cases was 18.0%, ranking from high to low as district, town, and countryside. The overall national case fatality rate was 4.0%, ranking from high to low as district, countryside, and town. Hubei Province accounted for the majority of reported cases, severe cases, and deaths. The overall national male-to-female ratio of attack rates was 0.94, although in 18 provinces the attack rate was higher in males than in females. The male-to-female ratios of attack rates in district, town, and countryside were 0.90, 1.12, and 1.20, respectively. The median age of
nationwide COVID-19 cases was 52 years (interquartile range, IQR: 39, 64). The median age of cases living in district was significantly higher than that of cases in town and countryside. The three most common occupational categories among COVID-19 cases overall were retired personnel, housework and unemployed, and farmers. The occupational distribution in district was similar to the overall distribution, but in town and countryside farmers accounted for about half of cases and the proportion of retired personnel was smaller. The proportion of HCWs in district was significantly higher than in town and countryside. The epidemic curves showed that the rapid-increase phase of case numbers started later in town and countryside than in district, but the epidemic peaks of the three occurred at about the same time. Spatiotemporal cluster analysis showed that 24 provincial capitals had clusters. The COVID-19 epidemic patterns in China’s mainland could be divided into five categories according to the characteristics of Rt. A series of interventions controlled the COVID-19 outbreak within two weeks, and relaxation measures such as the resumption of work and production did not cause a rebound of the epidemic.
2. In Wuhan, the attack rate of HCWs was about 4 times higher than that of non-HCWs. The highest HCW attack rate in a single hospital was 11.9%, but the proportion of severe and critical cases and the case fatality rate were significantly lower among HCW cases than among non-HCW cases. There was no significant difference between HCWs and non-HCWs in the time interval from onset to diagnosis, with a median (IQR) of 10 (5, 16) days, but the longest interval occurred in the HCW group. From mid-January 2020, the median time from onset to diagnosis of HCW cases was significantly shorter than that of non-HCW cases. HCWs working in county-level hospitals in areas with more non-HCW cases were more likely to have COVID-19. Compared with HCW cases working in the infectious
diseases department, HCW cases working in general departments, ophthalmology, and respiratory departments were more likely to deteriorate.
3. After adjustment for population density and distance to Wuhan, counties intersected by railways, freeways, or national highways, or having airports, were at significantly higher risk of being affected by COVID-19, with adjusted odds ratios (ORs) of 1.40 (95% confidence interval, CI: 1.14-1.72), 2.07 (95% CI: 1.61-2.67), 1.31 (95% CI: 1.02-1.68), and 1.70 (95% CI: 1.31-2.22), respectively. The GAM and LMM results showed that the relationships between meteorological factors and COVID-19 were nonlinear. A high attack rate was significantly associated with lower average temperature, moderate cumulative precipitation, and higher wind speed. Significant pairwise interactions were found among these three meteorological factors, with a higher risk of COVID-19 under low temperature and moderate precipitation. Warm areas could also be at higher risk of the disease as wind speed increased.
4. Internal validation showed that the area under the receiver operating characteristic (ROC) curve (AUC) of both the original and the simplified model was about 0.8, indicating good performance. External validation showed that both the original and the simplified model had high prediction accuracy, above 70%, for higher-risk and highest-risk areas. The original model had an advantage in predicting higher-risk areas, and the simplified model had an advantage in predicting highest-risk areas. The contributions of the various factors to the prediction differed significantly. The high-contributing factors in the original model were population density, distance from the epidemic center, and the total number of supermarkets and shopping malls. In the simplified model, the high-contributing factors were distance from the epidemic center and night light. The empirical results showed that both models extrapolated well to
the winter outbreak of COVID-19. Simulation of COVID-19 outbreaks in key areas in winter showed that the two models predicted consistent ranges for the higher-risk and highest-risk areas.
Conclusion
1. In district areas of China’s mainland, the risk of COVID-19 was higher among late-middle-aged people, females, and retired personnel; in town and countryside, the risk was higher among middle-aged people, males, and farmers. The number of HCW cases was higher in district than in town and countryside. COVID-19 might spread from urban to rural areas. Spatiotemporal analysis classified provinces into five epidemic patterns, suggesting that COVID-19 could quickly cause large-scale effects through wide spread and local transmission.
2. In Wuhan, the attack rate of COVID-19 was higher among HCWs than among non-HCWs. HCWs working in county-level hospitals in high-risk areas were more vulnerable to COVID-19. HCW cases working in general departments (especially ophthalmology) and respiratory departments were more prone to deterioration than cases working in the infectious diseases department.
3. Public transportation was a risk factor for the spread of COVID-19 over large areas. Temperature, precipitation, and wind speed could significantly affect the local transmission of COVID-19. Low-temperature areas with well-developed transportation and moderate precipitation were at higher risk of being affected by COVID-19, and warm areas with higher wind speeds were also at higher risk. Countries and regions with these conditions should therefore be given more attention, and effective strategies should be implemented to curb the spread of COVID-19.
4. The maximum entropy niche model established in this study had the advantage of high accuracy in predicting the transmission risk of COVID-19 in China’s mainland in the cold season. The use of night light data could simplify the model, make it convenient for front-line public health workers to use, and provide a methodological reference for the prediction and early warning of other
infectious diseases.
Innovation
1. This study found that, during the COVID-19 outbreak from late 2019 to early 2020, the risk of COVID-19 in district was higher among late-middle-aged people, females, and retired personnel, whereas in town and countryside it was higher among middle-aged people, males, and farmers. In addition, this study classified the country’s provinces into five epidemic patterns. These results have not been reported in previous studies.
2. This study analyzed the impact of transportation and meteorological factors at the county scale and reported the pairwise interactions of temperature, precipitation, and wind speed on the spread of COVID-19.
3. This study established a nationwide high-resolution prediction and early warning model for the spread of COVID-19 and used night light data in place of the traditional multiple socio-economic factors to simplify the model, providing a methodological reference for the early warning of other infectious diseases. |
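
The adjusted odds ratios reported in the Results can be read as exponentiated logistic regression coefficients. The sketch below is an illustration only, not the study's code: it assumes a Wald-type 95% interval, and the function names `odds_ratio_with_ci` and `se_from_reported_ci` are hypothetical. It recovers the log-scale standard error implied by the reported interval for railways (OR 1.40, 95% CI 1.14-1.72) and rebuilds the interval from it.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta: float, se: float) -> Tuple[float, float, float]:
    """Return (OR, lower, upper) for a logistic coefficient, assuming a Wald interval."""
    return (
        math.exp(beta),
        math.exp(beta - Z_95 * se),
        math.exp(beta + Z_95 * se),
    )


def se_from_reported_ci(lower: float, upper: float) -> float:
    """Recover the log-scale standard error implied by a reported 95% CI (Wald assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


# Reported adjusted OR for counties intersected by railways: 1.40 (95% CI 1.14-1.72).
beta = math.log(1.40)                 # coefficient on the log-odds scale
se = se_from_reported_ci(1.14, 1.72)  # implied standard error, an approximation
or_, lo, hi = odds_ratio_with_ci(beta, se)
print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

Because the reported limits are rounded to two decimals, the recovered standard error is approximate; the point is only that an interval that does not include 1 corresponds to a coefficient significantly different from 0 on the log-odds scale.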