Analysis of potential risk of COVID-19 infections in China based on a pairwise epidemic model

: The ongoing outbreak of the novel coronavirus pneumonia (also known as COVID-19) has triggered a series of stringent control measures in China, such as city closure, trafﬁc restrictions, contact tracing and household quarantine. These containment efforts often lead to changes in the contact pattern among individuals of the population. Many existing compartmental epidemic models fail to account for the effects of contact structure. In this paper, we devised a pairwise epidemic model to analyze the COVID-19 outbreak in China based on conﬁrmed cases reported during the period February 3rd–17th, 2020. By explicitly incorporating the effects of family clusters and contact tracing followed by household quarantine and isolation, our model provides a good ﬁt to the trajectory of COVID-19 infections and is useful to predict the epidemic trend. We obtained the average of the reproduction number R = 1.494 (95% CI: 1.483 − 1.507) for Hubei province and R = 1.178 (95% CI: 1.145 − 1.158) for China (except Hubei), suggesting that some existing studies may have overestimated the reproduction number by neglecting the dynamical correlations and clustering effects. We forecasted that the COVID-19 epidemic would peak on February 13th (95% CI: February 9 − 17th) in Hubei and 6 days eariler in the regions outside Hubei. Moreover the epidemic was expected to last until the middle of March in China (except Hubei) and late April in Hubei. The sensitivity analysis shows that ongoing exposure for the susceptible and population clustering play an important role in the disease propagation. With the enforcement of household quarantine measures, the reproduction number R effectively reduces and epidemic quantities decrease accordingly. Furthermore, we gave an answer to the public concern on how long the stringent containment strategies should maintain. Through numerical analysis, we suggested that the time for the resumption of work and production in China (except Hubei) and Hubei would be the middle of March and the end of April, 2020, respectively. These constructive suggestions may bring some immeasurable social-economic beneﬁts in the long run.


Introduction
A zoonotic coronavirus (named as COVID- 19), firstly detected in Wuhan, a central and crowded city of China, has crossed species to infect human population [1,2]. The symptoms of COVID-19 include cough, fever and difficulty breath, most like SARS (Severe acute respiratory syndrome conronarirus) and MERS (Middle East respiratory syndrome coronarirus) [3]. The period for such symptoms from mild to severe respiratory infections lasts 2 − 14 days. As indicated, the transmission routes contain direct transmission, such as touching or shaking hands and indirect transmission consist of the air by coughing and sneezing, even if contacting some contaminated things by virus particles [3]. Additionally, coronaviruses can be extremely contagious and spread easily from person to person. To date, cases of the COVID-19 coronavirus have been confirmed more than 70,000 -mostly in China, as well as in Japan, South Korea, Thailand, Singapore, Philippines, and the United States of America etc, and with nearly 2000 reported deaths. Apparently, outbreaks of COVID-19 have been become a globally public health concern in medical community as the virus is spreading around the world. The surge of confirmed cases continues even though some control measures such as city closure, extended holidays, travel restrictions and closure of live trade markets have been taken. Some reasons for the rapid growth of cases include increased travel for Chinese New Year, the long incubation period, increased detection, asymptomatic transmission, substantial person-to-person transmission, and continued environmental or animal exposure to a source of infection. As of 24pm on February 17th, 2020, the number of confirmed cumulative cases rose to 72436. The top five provinces in China are Hubei, Guangdong, Henan, Jiangxi, and Hunan, respectively. The reported cases of COVID-19 infections in Hubei province account for about 95% of the infections across China (see Figure1(a)). In order to cut off COVID-19 infection source, Chinese government took an extreme control strategy-prohibiting all the transports in and out of Wuhan and advised all the Chinese citizens to isolate themselves at home at 10am on January 23rd, 2020 [4]. However, more than 5 million people have left Wuhan during the Spring Festival period before January 23rd, 2020 , which increased the risk of COVID-19 infections [5]. The closure of Wuhan changed the behaviors of all the Chinese. As a result, before and after the city closure, there is a substantial change on the pattern of the COVID-19 infections. How to evaluate such a switch strategy play an important role in government decisionand policy-makers. Mathematical modelling is a useful tool to describe mechanisms of epidemic transmission and estimate the efficacy of some control strategies.
Although there are many mathematical models to study the dynamics of COVID-19 infection in China [6][7][8][9][10][11][12], they always assume that individuals are homogenously mixed and ignore the heterogenous contact structure between individuals. In fact, under such an extreme control strategy every person should almost stay at home and substantially reduce their contacts with outside. According to Chinese Statistic Yearbook [13], there are ten composition forms in the household structure. In network terminology, the population can be regarded as a contact network consisting of nodes and links, in which the nodes mimic individuals and links stand for social contacts between individuals among the population. The contact pattern is often characterized by a node degree distribution, where the node degree is the number of contacts an individual has. The contact structure under household quarantine (in this scenario every individual only contacts with its family members) exhibits a unimodal degree distribution (see Figure1(b)). In addition, such contact patterns under household quarantine exhibits strong clustering effects, depicting the closeness of triple cliques in a network. However, the widely used compartmental modelling approaches typically fail to characterize the effects of clustering feature, heterogeneity in contact behavior and dynamical correlations on COVID-19 propagation. In this paper, we devised a pairwise epidemic model to investigate the COVID-19 epidemics based on confirmed cases reported during the period February 3rd-17th, 2020. To our knowledge, many researchers have paid much attention on closure schemes for pairwise approximation for intensive computations and theoretical analysis [14][15][16][17][18]. Since such generated network belongs to a clustering with a homogeneous degree, we took the moment closure approach proposed by Keeling et al. [19,20] to approximate the number of triples.
For each emerging infectious disease, some common concerns include the reproduction number, transmission time interval, peak time, peak values and final size. These quantities determine the potential and severity of an outbreak and are in favor of evaluation of control measures and providing a guidance for slowing down the disease propagation. In this project, we based on data of COVID-19 infections collected from National Health Commission of China (NHCC) [21] to identify the parameters of the proposed model and calculated all the mentioned quantities. In particular, we forecasted the transmission dynamics of COVID-19 in Hubei province as well as in regions outside Hubei. Hubei province is the epicenter of COVID-19 outbreak and has suffered the most severe damage by the epidemic with lack of medical resources to quarantine exposed and suspect cases. In comparison, other regions outside Hubei possess sufficient medical resources and have a relatively minor burden of infected individuals due to imported cases from Hubei. In this work we investigated the effects of the city closure and close contact tracing followed by household quarantine. To our best knowledge, there are few works coupling the data and epidemic models on complex networks to research a specific disease. We tried to build a bridge connecting the two aspects to realize the theorem and practice.

Data collection and description
We collected all the daily reported data on COVID-19 infections from NHCC, where all the laboratory confirmed cases, suspected cases, death-caused by COVID-19, cured cases and close contacts of population with confirmed infections are defined [21,22].
Since the diagnosed approaches changed on 12th February-about 12000 clinically diagnosed cases are counted in the cumulative confirmed cases reported on that day in Hubei province [23]-there exists a singular point for cumulative confirmed cases on February 12th, 2020 (see Figure 1(a)). In order to obtain relatively reliable data, we allocated the sudden increment on 12th February to each day in the preceding week in proportion to the original daily increments of confirmed cases in these days (see Figure 2 for the revised data).
The clustering coefficient denoted by φ, as one of the network characteristic parameters, can characterize the isolation intensity of household quarantine measure for COVID-19 outbreak. The more intensive the household quarantine measure, the larger the clustering coefficient. According to the family distribution in the China Statistic Yearbook [13], we obtained that when every family is completely isolated, φ = 0.5 and when only one family member is allowed to go out for living necessities, φ = 0.42 (see Appendix A for details).

Pairwise epidemic model for COVID-19 infections
The main purpose of this article is to investigate how the novel coronavirus spreads across China after the Chinese government on January 23rd, 2020 made the level-1 public health emergency response to fight against the COVID-19 epidemic by a series of all-out efforts, such as sealing off the Wuhan city, restricting travels, contact tracing, household isolation, and putting those who had close contacts with infected ones in quarantine for medical observation. The measures of household isolation, contact tracing and targeted quarantine often result in relatively intensive family clusters and contact reduction between susceptible individuals and confirmed cases of the disease. Traditional compartment epidemic models typically fail to account for the cluster structure and contact changes among individuals in the population. In this situation, one of the alternatives can be a network epidemic model, where the whole population is considered as a contact network with each node representing an individual of the population and each edge (or link) standing for a contact between a pair of individuals.   Here we proposed a pairwise network epidemic model to incorporate the effects of family clusters and quarantine-caused contact cutting on the disease transmission. As demonstrated in Figure 3, we classified all of the individuals (nodes) into eight different states: unquarantined susceptible (S), quarantined susceptible (S 0 ), unquarantined latent (E), quarantined latent (E 0 ), unquarantined asymptomatic (A), quarantined asymptomatic (A 0 ), confirmed infected (I) and recovered (R). We assumed that individuals in either state of E, A and I are infectious; however, since infected individuals will be immediately put in quarantine once they are confirmed, the individuals in state I are not allowed to transmit the disease. The transitions between different states are as follows. Each S individual turns into the E state when it is infected at the transmission rate β 1 through a contact with an E individual, while at rate β 2 by an A individual. After a latent period of 1/q days, every E individual jumps into either the A class with probability 1 − p or the I class with probability p. The individuals in states I and A run into the R class at the recovery rate γ 1 and γ 2 , respectively. Meanwhile, the individuals in states S, E and A will be isolated and turn into the S 0 , E 0 and A 0 states, respectively, once any one of their neighbors is confirmed to have been infected. Each individual of state S 0 or A 0 will be released from quarantine and enter into the S or R state, respectively, after an observation period of 1/γ 3 days. Whereas each individual in state E 0 will be released from quarantine after an observation period of 1/γ 4 days, entering into either the A class with probability 1 − p or the I class with probability p. It is worth noting that the close contacts among individuals of different states are denoted by the pairwise variables in square brackets as shown in Figure 3. Parameters used in the model have been summarized in Table 2. The number of unquarantined exposed (latent) individuals The number of quarantined exposed (latent) individuals (i.e.,the number of latent individuals who have been contacted with confirmed individuals) The number of unquarantined asymptomatic individuals The number of triples with the joint structure S-E-E [EEA] The number of triples with the joint structure E-E- The number of triples with the joint structure The number of triples with the joint structure S-A-E Using the notations defined in Table 1, the pairwise model for COVID-19 epidemic can be described as follows. Firstly, the time evolution of the number of individuals in different classes is given by the following equations.
Since the time evolution of the number of individuals in each class given by Eqs.
For detailed explanation of the biological meaning of Eqs.
(1-14), the readers are referred to Appendix B. The appearance of the numbers of triples of different types suggests one to close the model by utilizing the moment closure approximation formula by Keeling et al. [19,20] as follows: where n is the average degree, φ is the clustering coefficient, and N is the number of nodes in the underlying network.

The reproduction number
The reproduction number R, which is defined as an average numbers of secondary cases that one case produces during the course of its infectious period, determines the increase growth rate of an emerging infectious disease. Sometime it can be used to judge whether or not one disease breaks out. Adopting the method proposed by Yang and Xu [25] or the next generation matrix method [26], the expression R takes the form of where and Since the pairwise epidemic model (1-14) is a networks model, we mainly concerned the variants of the infected links. The biological meanings of R can be regards as a sum of two quantities. R [SE] is the number of secondary infected links that one exposed link will generate in an entirely susceptible link (1 − φ)(n − 1) during its lifespan 1 β 1 +q as exposed. Note that, (1−p)q β 1 +q denotes a probability that one exposed link survives the expose link and progresses into the infected link. Hence, R [SA] is the number of secondary infected links that one infected link will produce in an totally susceptible links during its lifespan 1 β 2 +γ 2 .

MCMC estimation
Here, we briefly introduced the adaptive Metropolis-Hastings (M-H) algorithm of Markov Chain Monte Carlo (MCMC) simulations [28], by which we estimated parameters and fit data to our model (1-14).
Since we used the reported cumulative confirmed cases z(T) for data fitting, we denoted according to Eq. (3) the expectation of the cumulative confirmed individuals bȳ and its variation with a small perturbation by where z(1) =z(1), T = 1, 2, . . . , 15 corresponds to the days during the period February 3rd-17th, 2020, and δ is supposed to follow the normal distribution N(0, ε 2 ) with ε = 1000. Given the parameter vector Θ, the maximum likelihood function for the revised data Then, the joint posterior distribution of the parameters is given by P(Θ|Z) ∝ L(Z|Θ)P(Θ). Under the assumption of non-information prior distribution [28], the joint posterior distribution of the parameters satisfies P(Θ|Z) ∝ L(Z|Θ). Furthermore, at each iteration step the new parameter vectorΘ is updated through a random walk process and will be accepted with probability min 1, Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 27 February 2020 doi:10.20944/preprints202002.0398.v1 The population size of China (except Hubei) and Hubei province is, respectively, about 1336210000 and 59170000 [13], so we set [S](t 0 ) to be 1336210000 and 59170000 for China (except Hubei) and Hubei province, respectively. Here, t 0 stands for the starting date according to the reported data of suspected cases (identified by contact tracing) who are quarantined for medical observation. The quarantined individuals are isolated for 14 days, so we set γ 3 = γ 4 = 1/14. According to literature [6], γ 2 is about 1/12, and the latent period is about 7 days. However, as the test technique and supply of medical resources improve, especially outside Hubei province, we set the latent period 1/q to 6 days for China (except Hubei) and 7 days for Hubei. Moreover, based on the surveillance data we calculated the average of hospitalization period for confirmed cases to be about 10 days. Thus we set γ 1 = 1/10. The average node degree n in the network is the average number of contacts an individual has in the population. According to the degree distribution (Figure 1(b)) obtained from the data published on the Chinese Statistic Yearbook [13], we got n = 3.1 for China (except Hubei) and n = 3.04 for Hubei. We determined the initial numbers of EE-links, EA-links and AA-links based on the following approximation formula,

Parameters and initial conditions
where [X] or [Y] represents the number of nodes in state X or Y and [XY] represents the number of XY-links. Note that we have set the initial numbers of EE-links, EA-links and AA-links to 1 since the formula (23) gives a real number less than 1. All the parameters and initial values are summarized in Table 2 and Table 3.

Fitting results
Based on the analysis of the diagnosed cases of COVID-19 infection reported in [4], we adopted the MCMC method [28] for 10000 iterations with a burn-in of 6000 iterations to fit the model (1-14) and estimated the parameters and the initial conditions of variables (see Tables 2 and 3). The Metropolis-Hasting (MH) algorithm is used to adapt and readjust the MCMC procedure. Figure  4 shows a good fitting between the model solution and real data, well suggesting the epidemic trend in China under the intervention of the home-quarantine containment strategies. According to the estimated parameter values and initial conditions as given in Tables 2 and 3, we obtained the mean value of the reproduction number R = 1.494 with 95% CI of [1.483, 1.507] for the Hubei province and R = 1.178 with 95% CI of [1.145, 1.258] for the regions outside Hubei province. Moreover, we readjusted the above processes and got good fitting results in the top four provinces in China: Guangdong, Henan, Hunan, and Jiangxi, respectively (see Figure A1 in Appendix C). In what follows, we mainly employed our model with estimated parameters to explore the epidemic behavior of COVID-19 infections in China (except Hubei) and in Hubei.

Prediction of epidemiological quantities
Applying the estimated parameter values, we forecasted that the outbreak of COVID-19 in China (except Hubei province) is expected to peak on February 7th, 2020 (95% CI between February 3rd and February 11st) and the peak value in terms of confirmed infectious cases is predicted to be I max = 6, 253 with 95% CI of [6, 150, 6, 368] ( Figure 5(a)), whereas the peak time of COVID-19 infections in Hubei province is expected to be on February 13th, 2020 (95% CI between February 9th and February 17th) with the peak value of I max = 24, 674 (95% CI [24,314,24, 582]) ( Figure 5(b)). In addition, the COVID-19 epidemic in China (except Hubei) and Hubei province is expected to terminate by the end of March and late April, respectively. The final sizes are forecasted to be 13,757, with 95% CI of [12,678,14  ongoing exposure of human population to COVID-19 contagion in Hubei province is relatively higher than other regions in China. Moreover, limited resources of medical and health cares in Hubei province, especially in Wuhan, the epicenter of the epidemic, have significantly increased the risk of COVID-19 propagation. It is worth noting that the epidemiological predictions presented in this paper are based on the current control strategies. Arguably, the peak time, peak value and duration of the epidemic will change accordingly as long as the disease control measures are modified.

The sensitive analysis
In the face of an emerging disease, the Chinese government has taken many efficacious strategies to control the disease transmission, such as traffic restrictions, isolations, home quarantine, increasing medical resource and propaganda education, etc. To examine and evaluate the potential efficacy of these strategies, we did the sensitivity analysis for two vital model parameters q and φ, which reflect the intensity of detection and isolation, respectively. Figure 6 and Tables 4-5 show that shortening the exposed period 1/q from 8 days to 4 days advances the peak arrival time and decreases the final size, although increasing the peak size of confirmed cases. Also, the epidemic duration advances about one week. Increasing the detection rates as an increase of p is in favor of mitigating COVID-19 transmission.  As Section 2.1 introduced, the intensity of isolation under the household quarantine measure is characterized by the clustering coefficient φ. Table 4-5 shows that, with the increase of φ or the intensity of isolation in China , the the peak size and the final size decrease, the peak time advances and the epidemic duration is shortened. It is worth mentioning that when the intensity of isolation increases, the exposure proportion of susceptible individuals to COVID-19 decreases. It means that both clustering coefficients φ and the exposure proportion of susceptible individuals may affect the spread of COVID-19. Figure 7 presents the contour figure of the final size with respect to φ and the exposure proportion of susceptibles. We could see that with the decrease of φ from 0.5 to 0.1 or the increase of the susceptible exposure proportion from 0.35 to 0.55, the final sizes increase accordingly. In next section, we will further discuss their effects on COVID-19 combined with control measures.

Discussion
COVID-19 is rife in China. The Chinese government has taken many extreme strategies after January 23rd, 2020, which resulting in strong correlation between exposed individuals, asymptomatic individuals, confirmed cases and susceptible people. Moveover, Chinese government set up a series of tracking system in order to follow the trail of each confirmed case and found close contacts of population. Once they found these close contacts of ones, they will be quarantined somewhere (at home or a central isolation). In consideration of the above features, we built a COVID-19 infection model by pairwise approach which couples with the Chinese family network structures and clustering effects, and considered the instant cut-off of close contacts with confirmed cases. At the time of writing, what people concern is when they can return to work and school. As shown in Section 2.1, the clustering coefficient, as one of the network characteristic quantities, reflects the gathering degree of individuals and plays a key role in the modelling process. In our study, as described above, φ = 0.42 represents the gathering degree in current household quarantine measure. Once returning to work and school, people interconnects closely and the clustering coefficient decreases (see Appendix A). Meanwhile, Figure 7 indicates isolation measure may change φ and the exposure proportion of susceptible individuals. Therefore, Figure 8 provides some constructive advise based on our network model. We have seen from Figure 8(a) that in terms of the total population ([S](0) = N) in China except Hubei province, when increasing φ from 0.42 to 0.5 or decreasing φ from 0.42 to 0.1 on the 7th February, the final size and the epidemic duration have small changes. On the contrary, in the terms of partial population ([S](0) = 0.5N), when decreasing φ from 0.42 to 0.1 on February 7th, the final size increases from 13,757 to 17,791 and the epidemic duration delays from 79 days to 100 days, and when the same change of φ appears on March 1st, the final size and epidemic duration changes slightly (see Figure 8 (b)). It suggests that for areas outside Hubei, it would be better to maintain the current control measures and not to resume work and production until the middle of March. Similarly, in Hubei province, when increasing φ from 0.42 to 0.5 or decreasing φ from 0.42 to 0.1 on the 13th February in terms of total population (see Figure 8 (c)), the final size and the epidemic duration have trivial changes, whereas when taking the same changes in terms of partial population (Figure 8 (d)), the final size increases from 74,055 to 98,593 and the epidemic duration delays from 106 to 119 days. Thus, it may be plausible to resume work, production and other gathering activities in Hubei province until the end of April.
The q, one of our model parameters, reflects the strength of testing efforts. Figure 9(a) indicates that when intensifying the medical detection (q from 1/8 to 1/4) on February 7th, 2020 in China (except Hubei), the final size decreased from 13,859 to 13,625 and the epidemic duration advances about one week. However, when taking the same detection change on 1st March, the effects are small. Likewise, similar impacts happen in Hubei province. It means that the medical detection is promoted earlier, COVID-19 ends earlier and the final size is smaller.

Conclusion
An emerging outbreak of COVID-19 in China has lasted almost three months and it brings about 77,046 confirmed cases, 2,445 deathes up to February 23rd, 2020. Facing to the sudden epidemics, the Chinese government instantly has set up a series of extreme control measures including the closure of city, traffic restrictions, close tracing, holiday extensions and even household (family) quarantine since January 23rd, 2020. With the change of these policies, especially the closure of city and close contact tracing followed by household quarantine, the clustering phenomena occasionally exhibits in form of family clusters and hence the transmission patterns of COVID-19 infections have radically altered. Considering such signatures, we proposed a pairwise epidemic model on a network with relevant population structure to address the dynamics of COVID-19 propagation in China. Coupling surveillance data started on February 3rd with our proposed model, the reproduction numbers are evaluated as 1.494 (95% CI [1.483, 1.507]) for Hubei province and 1.178 (95% CI [1.145, 1.258]) for China except Hubei, which are obviously smaller than those published in some literature [6,8,11,12] and they seems to be consistent with the result in [2]. Actually, there are two primary reasons: one is that those values of estimated parameters in the model heavily depend on the choice of the initial surveillance data, the other is that most of existing results ignore clustering effects induced by home isolation as well as dynamical correlations between susceptible individuals and infectious individuals. Interestingly, our model successfully mimics the time surveillance data from February 3rd to February 17th and the solution of model in China (see Figure 4). Moreover, our results can be extended to make  predictions for Guangdong, Hunan, Henan, and Jiangxi, as seen in Figure A1. Based our model, the predicted peak arrival time and cumulative confirmed cases are close to the real data of February 12th and the real number of cases of 72, 582. Additionally, the current strategies should maintain at least until the middle of March in China, otherwise, it may lengthen the epidemic duration and enlarge the epidemic size (see Figures 8 and 9). Our study has both significant theoretic and practical values. The COVID-19 infections in China will be definitely eliminated under continuous timely and effective public health policies and containment measures.
to be friends of each other. Therefore, in this special case of family quarantine, the clustering coefficient is maximal, which takes where F k is the number of families each with k family members, C represents combination and A represents permutation. When one family member goes out for living necessities, we let the number of links for every family increase by one. Ignoring the few addition of triangles, the minimum clustering coefficient in this case is It follows from Eqs. (A1) and (A2) that the addition of inter-household connections (i.e., contacts between different families) leads to the decrease of the clustering coefficient in the entire population. The more inter-household connections, the smaller the clustering coefficient. In the special case when the household quarantine measure is revoked, the resumption of work and production will substantially increase individuals' outdoor connections, giving rise to a relatively smaller clustering coefficient than the value of φ = 0.42 in the case of household quarantine. Consider that every individual adds 7 outdoor contacts after they resume the work, then on average the total coefficient will be about φ = 0.1. In this paper, we adopted the case of φ = 0.1 to study the effects of resumption of work and production.

Appendix B Elaborations of the pairwise epidemic model (1-14)
This subsection elaborates on the dynamical equations of pairwise epidemic model (1-14) presented in the Methods section 2.2. On the right hand side (rhs) of Eq. (1), the first term describes the addition of S individuals as the S 0 individuals are released from quarantine and return to the S state, the second and third terms denote the reduction of S individuals due to infection through contacts with E and A individuals, respectively. The fourth term corresponds to the reduction of S individuals as they are quarantined after being traced to have close contacts with confirmed cases of the disease. On the rhs of Eq. (2), the first and second terms represent the addition of E individuals due to infections caused by infected neighbors in states E and A, respectively. The third term depicts the outflow of E individuals after a lapse of latency period. The fourth term stands for the reduction of E individuals as they are quarantined by contact tracing. On the rhs of Eq. (3), the first and second terms stand for the inflow of I individuals from the E and E 0 states, respectively. The third term denotes the recovery of confirmed cases. On the rhs of Eq. (4), the first and second terms correspond to the inflow of A individuals from the E and E 0 individuals, respectively. The third term is the recovery of A individuals and the fourth term describes the reduction of A individuals due to quarantine by contact tracing. On the rhs of Eq. (5), the first, second and third terms represent the gains of R individuals due to recovery of I, A and A 0 individuals, respectively. On the rhs of Eq. (6), the first term depicts the quarantine of S individuals as a result of contact tracing and the second term describes the event that the S 0 individuals turn back to the S state after being released from quarantine. On the rhs of Eq. (7), the first term represents the quarantine of E individuals by contact tracing and the second term denotes the outflow of E 0 individuals after a quarantine period. On the rhs of Eq. (8), the first term stands for the inflow of A 0 resulting from quarantine of A individuals by contact tracing and the second term expresses the recovery of A 0 individuals.
On the rhs of Eq. (9), the first and second terms refer to the loss of SS-links due to infection through SSE-type triples and SSA-type triples, respectively. The third term considers the decrease of SS-links for the simple reason that one S individual is isolated from contact tracing. On the rhs of Eq. (10), the first and second terms describe the increase of SE-links as a result of infection through SSE-type triples and SSA-type triples, respectively. The third and fourth terms correspond to the reduction of SE-links as a result of direct infection and expiration of latent period, respectively. The fifth and sixth terms depict the loss of SE-links because of infection through ESE-type and ASE-type triples, respectively. The seventh and eighth terms are attributed to quarantine of the S and E individuals, respectively, by contact tracing. On the rhs of Eq. (11), the first term denotes the creation of SA-links because of the expiration of latent period of the E individuals on SE-links. The second and third terms stand for the outflow of SA-links as a result of recovery and infection, respectively. The fourth and fifth terms refer to the loss of SA-links because of infection through ESA-type and ASA-type triples, respectively. The sixth and seventh terms represent the loss of SA-links as a result of quarantine of the S and A individuals, respectively, by contact tracing. On the rhs of Eq. (12), the first, second and third terms represent the addition of EE-links due to infection through the SE-links, ESE-type triples and ASE-type triples, respectively. The fourth and fifth terms account for the outflow of EE-links resulting from quarantine of one E individual on EE-links and EEE-type triples, respectively. On the rhs of Eq. (13), the first term considers the addition of EA-links because one of the E individuals on EE-links jump into the A class after the latent period. The second, third and fourth terms describe the increase of EA-links because of infection through SA-links, ESA-type triples and ASA-type triples, respectively. The fifth and sixth term depict the loss of EA-links as a result of expiration of latent period of E individuals and recovery of A individuals, respectively. The seventh and eighth terms refer to the outflow of EA-links due to quarantine of the E individuals and A individuals, respectively, by contact tracing. On the rhs of Eq. (14), the first term represents the creation of AA-links because the E individuals on the links jumps into the A class after the latent period. The second term describes the loss of AA-links as one of the A individuals on the AA-link recovers. The third term considers the loss of AA-links due to quarantine of one of the A individuals by contact tracing.

Appendix C Fitting results in other four hard-hit provinces
To further calibrate the ability of our model in prediction of the epidemic behavior, we also present in Figure A1 the fitting results of COVID-19 spread in other four hard-hit provinces, namely Guangdong, Henan, Hunan and Jiangxi provinces. Indeed, the model has a good fit to the trajectory of the coronavirus prevalence in these regions. It is worth mentioning that we have adopted the reported data of cumulative confirmed cases of COVID-19 infections from January 24th to February 17th, 2020, as the data in these regions are not much influenced by the data reported in Wuhan, the epicenter of the COVID-19 epidemic.