Modeling and Analysis of The Early-Growth Dynamics of COVID-19 Transmission

As the ongoing pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), commonly referred to as COVID-19, sweeps across the globe at an unprecedented rate, leaving behind trails of high infection and mortality, it is crucial to understand the propagation dynamics of the virus in a host population in order to take urgent and effective remedial and mitigating steps to save lives. It has already been observed in many countries and communities that accurate and timely testing, tracing, and tracking of the infection lead to better containment and a slower spread. In this exploratory research, the early growth dynamics of infection within a population are studied based on real data. The study posits that the early growth in a homogeneous population follows an exponential pattern, motivating further rigorous quantitative treatment based on a number of analytical models such as the logistic model, the Richards model, and the Gompertz model. The acceleration pattern of the outbreak is ascertained from the daily infection data, and regression analysis against population models yields dynamic growth indices that allow very accurate prediction of the successive outbreak size when calibrated continually with updated data. The performance of the various models is evaluated with the real dataset. Moreover, the basic reproduction number of the COVID-19 virus propagation in the community is estimated from the onset-phase dataset using a multi-compartmental epidemiological model. The maximum infection size, the infection doubling time, and the scope of herd immunity are also inferred for COVID-19 in a population.


Introduction
The World Health Organization (WHO) upgraded the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak to a pandemic on March 11, 2020 [3,4]. This renders the tracking and tracing of infected individuals a very demanding task. The virus targets the lungs, typically entering the body through the mouth and nose in respiratory droplets [5,6]. As a result, the transmissibility of the virus is rather high among communities with close proximity and a high contact rate. This puts countries with higher spatial population density at extraordinary risk; in particular, the countries in South Asia are apparently more vulnerable to the contagion due to a combination of factors ranging from population density, societal structure, and culture to constrained health-care facilities. Bangladesh is among the South Asian countries with a significant population size, which makes it urgent to analyze and understand the COVID-19 outbreak dynamics there.
Globally ranked 8th for population size, Bangladesh is home to approximately 161 million people with a staggering population density of nearly 2864 per square mile, which is the 7th highest in the world [7,8]. This very large population, packed into a rather limited area, makes Bangladesh uniquely vulnerable to a large outbreak and a potential human catastrophe. Besides, the anticipated size of the infected population will soon overwhelm healthcare services and would make any contingencies or interventions difficult. The first infection was reported in Bangladesh with three cases on 7th March 2020, and the Government imposed an emergency 'lockdown' from 26th March by declaring a full-fledged closure of religious and educational institutions, offices, businesses, shops, and malls, and a ban on all social gatherings, with an exception for some emergency services. As of 10th May, almost 62 days since the first reported cases, the total number of infected people in Bangladesh, according to the official report, stands at 20,995 with a total of 314 dead as of 15th May, and the numbers are mounting daily [9,10]. The importance and urgency of analyzing the infection dynamics in this spreading phase of COVID-19 in Bangladesh cannot be overemphasized: such analysis can inform the decision to execute any exit plan out of the outbreak with minimal loss of life and economic hardship.
In recent times, out of obvious necessity and urgency, a huge effort has been undertaken to understand the COVID-19 pandemic based on data in different spatial contexts globally, which is reflected in a growing body of research in this area [11,12,13]. Infectious diseases are typically modeled as a diffusion process involving a pathogen that disseminates among a given homogeneous or heterogeneous population with a characteristic transmission strength. The process is governed by a system of coupled differential equations whose variables indicate the different compartments into which the population is subdivided; e.g., the SIR epidemiological model assumes a homogeneous population that undergoes transitions from susceptible ($S$) to infectious ($I$) to removed ($R$) states as a virus sweeps through. Variants of this model are widely used for understanding the outbreak and for exploring different scenarios to forecast its future development, essentially to understand the 'flattening' behavior of the infection in order to impose and modulate emergency public measures like lockdown by gauging the healthcare facilities available to the population [14,15,16,17,18]. Another theoretical approach, which captures the inherent randomness of infection, is to resort to stochastic epidemiological approaches based on (generalized) logistic models and (sub-)exponential models [19,20,21].
In this study, a phenomenological approach, based on both stochastic and multi-compartment models, has been adopted to model the real dataset of COVID-19 infection in Bangladesh. The study shows that the growth stage in Bangladesh is highly irregular, with a typical exponential growth feature. Motivated by the growth trend in the real data, exponential model techniques along with the logistic model have been applied to extract the trend parameters of the outbreak. Further regression analysis has been conducted to obtain high-precision fits of the data to the models, and the fitting parameters have produced commendable accuracy in predicting future infection values in the region. Comparative analysis has shown that the Richards model and the Gompertz model perform remarkably better than the other parametric approaches in fitting and predicting. The study further estimates the basic reproduction number, $R_0$, for the infection in Bangladesh, using a multi-compartmental model against the real dataset. The estimation posits $R_0 \approx 3.0$ for the infection, which accounts for the observed exponential growth trend in the country. The temporal variation of the reproduction number is also noted. Further numerical analyses show that the scope and rate of growth of the pandemic in this region are fortunately lower than earlier grave projections.
The doubling time of the infection, the maximum infection population size as well as the herd immunity threshold for the current growth are also estimated and presented in this study.

Phenomenological Growth Model
Phenomenological models are data-driven empirical approaches that are not based on first principles or any specific physical mechanism [20,22]. These models provide insight into the patterns in the observed data, and so are good candidates for an initial probe of investigation, which eventually motivates reproducing the empirical observations with more rigorous mathematical models. As the data suggest, Bangladesh is still in the early stage of epidemic growth with exponential features, so an exponential model, as a special case of power-law based growth, is a good method for studying its initial growth dynamics.
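As an illustration of how an exponential growth rate might be extracted from early cumulative counts, the following is a minimal sketch that assumes the simple form $C(t) = C_0 e^{rt}$ and fits it by ordinary least squares on $\ln C$. The function name `fit_exponential` and the synthetic numbers are hypothetical and are not taken from the study's dataset or code.

```python
import math

def fit_exponential(days, cases):
    """Least-squares fit of ln(C) = ln(C0) + r*t; returns (C0, r)."""
    logs = [math.log(c) for c in cases]
    n = len(days)
    t_mean = sum(days) / n
    y_mean = sum(logs) / n
    # Slope and intercept of the straight line through (t, ln C).
    sxx = sum((t - t_mean) ** 2 for t in days)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(days, logs))
    r = sxy / sxx
    c0 = math.exp(y_mean - r * t_mean)
    return c0, r

# Synthetic illustration only (NOT the reported Bangladesh data):
# three initial cases growing at an assumed rate of 0.2 per day.
days = list(range(10))
cases = [3.0 * math.exp(0.2 * t) for t in days]
c0_hat, r_hat = fit_exponential(days, cases)
```

On noisy real counts, a fit on the logarithm weights early (small) counts more heavily than a direct nonlinear fit would, so the two approaches can give somewhat different rates.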

Logistic Model
The logistic model is a basic empirical epidemiological model characterized by a sigmoid curve: initial exponential growth continues until an inflection point, after which the growth gradually slows down to saturation, resulting in the S-shaped profile of the dynamics [23]. It can roughly predict the development and transmission behavior of the outbreak through logistic regression analysis. The rate of change of the cumulative infection, $C(t)$, according to the logistic model, may be written as
$$\frac{dC}{dt} = kC\left(1 - \frac{C}{N}\right) \qquad (2)$$
Here $k$ represents the exponential growth rate, and $N$ the total population size, with $\lim_{t\to\infty} C(t) = N$. With the boundary condition $C(t=0) = C_0$ indicating the initial infection incidence, the solution of equation (2) is
$$C(t) = \frac{N}{1 + \left(\frac{N}{C_0} - 1\right)e^{-kt}} \qquad (3)$$
The model parameters in (3) are $k$, $C_0$ and $N$, and these can be ascertained by fitting and regression analysis against the dataset.
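The logistic curve can be sketched numerically as follows, assuming the standard rate equation $dC/dt = kC(1 - C/N)$ with $C(0) = C_0$. The parameter values are hypothetical placeholders chosen only for illustration; they are not fitted values from the study.

```python
import math

def logistic(t, k, c0, n):
    """Closed-form logistic curve with C(0) = c0 and C -> n as t grows."""
    return n / (1.0 + (n / c0 - 1.0) * math.exp(-k * t))

def logistic_rate(c, k, n):
    """Right-hand side of the logistic rate equation, dC/dt = k*C*(1 - C/n)."""
    return k * c * (1.0 - c / n)

# Hypothetical parameters (assumptions, not estimates from the paper).
k, c0, n = 0.15, 3.0, 1.0e5

# The inflection point is reached when C = n/2, i.e. at this time:
t_inflect = math.log(n / c0 - 1.0) / k
```

Before `t_inflect` the daily increments keep growing; after it they decline, which is the "inflection" the text refers to.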

Gompertz Model
Traditionally employed for population growth, the Gompertz model is characterized by a sigmoid function describing growth as being slowest at the onset and at the end of a given time period; the asymmetric nature of the resulting S-shaped curve is in contrast to that of the logistic curve [24]. The slow onset behavior described by the model makes it very suitable for modeling the infection growth pattern in Bangladesh. The mathematical form of the Gompertz model may be written as
$$C(t) = a\,e^{-b\,e^{-ct}} \qquad (4)$$
Here, the parameters are $a$, $b$ and $c$: $a$ is the asymptote, $\lim_{t\to\infty} C(t) = a = N$, the size of the population; $b$ modulates the translation of the curve along the time axis; and $c$ is the growth rate of the incidence.
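The asymmetry of the Gompertz curve can be seen in a short sketch, assuming the common three-parameter form $C(t) = a\,e^{-b\,e^{-ct}}$. The parameter values below are hypothetical and serve only to illustrate where the inflection falls.

```python
import math

def gompertz(t, a, b, c):
    """Gompertz curve: asymptote a, time-shift parameter b, growth rate c."""
    return a * math.exp(-b * math.exp(-c * t))

# Hypothetical parameters (assumptions, not estimates from the paper).
a, b, c = 1.0e5, 10.0, 0.05

# The second derivative vanishes when b*exp(-c*t) = 1, i.e. at t = ln(b)/c,
# where the curve has reached a/e (about 36.8% of the asymptote).
t_inflect = math.log(b) / c
```

The inflection at $a/e$, rather than at half the asymptote as in the logistic curve, is what makes the Gompertz profile asymmetric.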

Richards Model
The logistic and Gompertz models describe the growth dynamics as an S-shaped sigmoid curve whose shape is fixed by the rigid model parameters. The Richards model, on the other hand, offers flexibility in the growth profile by incorporating an additional modulating parameter in the form of an exponent, resulting in a power-law formulation of the original logistic model [25]. This renders the model very effective for application to the current COVID-19 outbreak, as the infection incidences in the onset period in Bangladesh are observed to be very random and irregular. The Richards differential equation model for epidemiological application is given by
$$\frac{dC}{dt} = rC\left[1 - \left(\frac{C}{N}\right)^{s}\right] \qquad (5)$$
Here, $r$ represents the intrinsic growth rate of the epidemic, and $s$ is the modulating parameter offering more freedom in the bending of the S-shaped sigmoid curve. The solution to equation (5), with the initial condition $C(t=0) = C_0$, is
$$C(t) = \frac{N}{\left[1 + \left(\left(\frac{N}{C_0}\right)^{s} - 1\right)e^{-srt}\right]^{1/s}} \qquad (6)$$
Note that the model with $s = 1$ reduces to the traditional logistic model.
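The role of the extra exponent can be checked numerically. The sketch below assumes one common parametrization of the Richards model, $dC/dt = rC\,[1 - (C/N)^{s}]$ with $C(0) = C_0$, under which the logistic curve is recovered when the exponent equals one; the paper's exact parametrization may differ, and all parameter values are hypothetical.

```python
import math

def richards(t, r, s, c0, n):
    """Closed-form solution of dC/dt = r*C*(1 - (C/n)**s) with C(0) = c0."""
    return n / (1.0 + ((n / c0) ** s - 1.0) * math.exp(-s * r * t)) ** (1.0 / s)

def richards_rate(c, r, s, n):
    """Right-hand side of the assumed Richards rate equation."""
    return r * c * (1.0 - (c / n) ** s)

def logistic(t, k, c0, n):
    """Standard logistic curve, used here only for comparison."""
    return n / (1.0 + (n / c0 - 1.0) * math.exp(-k * t))

# Hypothetical parameters (assumptions, not estimates from the paper).
r, c0, n = 0.15, 3.0, 1.0e5

# With s = 1 the Richards curve coincides with the logistic curve.
gap_at_s1 = abs(richards(40.0, r, 1.0, c0, n) - logistic(40.0, r, c0, n))
```

Values of the exponent below one bend the curve earlier, and values above one keep it closer to pure exponential growth for longer, which is the added freedom the text describes.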

Compartmental Model
The counterparts to the phenomenological models are the mechanistic models, which describe the diffusion process of epidemic transmission within a population as a spatial-temporal dynamical mechanism. In such models the population is typically sub-grouped into compartments consisting of individuals who are susceptible, infectious, or removed (because of recovery or death) along the temporal progression of the infection [26]. Each of these compartments is expressed by an ordinary differential equation with coefficients indicating the virulence of the pathogen, the chances of physical contact, symptomatic as well as asymptomatic interactions among the population, and the probability of transfer of individuals among the different sub-groups, giving rise to the features of a nonlinear coupled dynamical system. Thus, compartmental models are useful for forecasting the short-term as well as long-term development of the epidemic, and for assessing various interventions, such as lockdown, to control the speed and trajectory of the outbreak in a population. Compartmental models are also a crucial theoretical tool for interrogating the resilience of the available public health system with respect to the predicted infection profile; the interventions may be modulated along the temporal phases to better respond to the infection, and thus the severity of the outbreak can be mitigated. One of the underlying assumptions of the model is the homogeneity of the population with respect to infection prevalence. The high spatial population density of this study case renders the compartmental model especially suitable for exploring the onset epidemiological states of the population and following their development.
The mechanistic model variant used in this study is the traditional SIR (susceptible-infectious-removed) compartmental model [27,28], governed by the following system of differential equations:
$$\frac{dS}{dt} = -\frac{\beta S I}{N} \qquad (7)$$
$$\frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I \qquad (8)$$
$$\frac{dR}{dt} = \gamma I \qquad (9)$$
Here, the three compartment variables, denoted by $(S, I, R)$, evolve in time according to the dynamical system (7)-(9) while preserving the constraint on the total host population, $N = S(t) + I(t) + R(t)$. The non-negative parameters are $\beta$, the transmission rate per infectious individual, and $\gamma$, the recovery rate, so that the infectious period is exponentially distributed with mean $1/\gamma$. At the onset of the epidemic growth phase, for a completely susceptible population indicated by $S(0) \approx N$, equation (8) can be solved analytically to yield the exponential form $I(t) \approx I(0)\,e^{(\beta - \gamma)t}$. The product of the transmission rate and the recovery period gives the average number of secondary incidences generated by an individual primary infection, which is referred to as the basic reproduction number, $R_0 = \beta/\gamma$ (R-naught). In the later part of the study, the reproduction number is estimated from the initial infection growth data of Bangladesh. The basic reproduction number can also be used to estimate another useful observable known as the herd immunity threshold (HIT), which indicates the fraction of the susceptible population that must become immune, either by vaccination or by natural antibody generation, for the infection to die away statistically. This value is also calculated for the case nation in a later part.
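A minimal numerical sketch of the SIR dynamics follows, assuming the frequency-dependent form $dS/dt = -\beta S I/N$, $dI/dt = \beta S I/N - \gamma I$, $dR/dt = \gamma I$ and a simple forward-Euler integrator. The rates, population size and function names are hypothetical choices made for illustration (giving a ratio of transmission to recovery rate of 3); they are not the fitted values of the study.

```python
def sir_step(s, i, r, beta, gamma, n, dt):
    """Advance (S, I, R) by one forward-Euler step of size dt (in days)."""
    new_infections = beta * s * i / n * dt
    new_removals = gamma * i * dt
    return s - new_infections, i + new_infections - new_removals, r + new_removals

def simulate_sir(n, i0, beta, gamma, days, dt=0.01):
    """Return one (S, I, R) tuple per day, starting from i0 infectious."""
    s, i, r = n - i0, i0, 0.0
    steps_per_day = round(1.0 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            s, i, r = sir_step(s, i, r, beta, gamma, n, dt)
        trajectory.append((s, i, r))
    return trajectory

# Hypothetical parameters (assumptions, not estimates from the paper).
N_HOST, I0, BETA, GAMMA = 1.0e5, 3.0, 0.3, 0.1
basic_reproduction_number = BETA / GAMMA
trajectory = simulate_sir(N_HOST, I0, BETA, GAMMA, days=200)
```

While nearly everyone is still susceptible, the infectious compartment in this sketch grows approximately as $I(0)\,e^{(\beta-\gamma)t}$, which is the early exponential phase discussed in the text. A production analysis would likely use an adaptive ODE solver rather than fixed-step Euler.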

Model Performance Metric
The data obtained for the study are in the form of a time series of cumulative confirmed incidences. The residual, comprised of the difference between the observed and predicted values, is minimized to measure the goodness of fit of the model. As the performance metric, the $R^2$ value [29], or coefficient of determination, is evaluated based on the following mathematical form:
$$R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2} \qquad (10)$$
Here, $y_i$ and $\hat{y}_i$ are the observed and model-predicted values, and $\bar{y}$ is the average of the actual cumulative confirmed incidences. It is clear from the relation that a value of the metric near unity implies an accurate prediction and hence a higher performance of the model under evaluation. In addition, the 95% confidence interval (CI) is estimated based on the t-values, both to evaluate the model fit against the reported data and to ascertain the performance of the model projections.
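The goodness-of-fit metric can be computed in a few lines. This sketch assumes the usual definition of the coefficient of determination, one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the observations; the function name is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

# Toy values for illustration only (not the study's data).
observed = [1.0, 2.0, 3.0, 4.0]
perfect_fit = r_squared(observed, observed)          # model reproduces data
mean_only_fit = r_squared(observed, [2.5] * 4)       # model predicts the mean
```

A model that reproduces the data exactly scores one, while a model that only predicts the mean of the observations scores zero, so values near unity indicate a close fit.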

Data and Tools
The data used for the analyses and predictions were obtained from the Institute of Epidemiology, Disease Control and Research (IEDCR) [9], a government research institute, and the Directorate General of Health Services (DGHS) [10], a government agency, both under the Ministry of Health and Family Welfare, Bangladesh. The World Health Organization's situation reports on COVID-19 [30,31] were also followed as a secondary source and were used to cross-check for any irregularities. The primary fields of the data sourced from IEDCR and DGHS contain the daily infection number along with the total count of tests conducted, the daily mortality, and the recovery numbers.
The data configuration and management were done in MS Excel 2016, and the model simulation and regression analyses were conducted in Python 3 [32,33] on Windows 10 Pro. The hardware for the study was an Intel(R) Core(TM) i3 processor with a clock rate of 2.70 GHz.
[...] implies that the logistic model is a good candidate for exploring the onset transmission dynamics of a pandemic, in contrast to its ability for long-term projection of the transmission development. We note from the simulation results that Bangladesh has yet to reach the inflection point, after which the incidence rate would decline.
[...] days as per the global observation [3,4]. Though the simulation is tailored to capture the early-growth dynamics of the transmission, it offers tentative insight into the future projection of the pandemic, provided the current rates persist. Figure 7(a) implies a 250-day duration window for the pandemic in the country, though the parameters may change due to applied interventions and also due to intrinsic changes in the variables. Figure 8 depicts the degeneracy in the SIR model: the reported data can train the same model and fit well in every scenario with different host population sizes, yielding different parameter values. So, depending on the size of the target population hosting the infection, the initial growth trend may lead to different projected outcomes.

Figure 8:
This panel cluster illustrates the degeneracy involved in modeling the early exponential phase of the infection, where the same data may be fitted to a high degree of accuracy by multiple scenarios governed by the SIR model at different points in parameter space. Across panels (a)-(d) the vulnerable population size is varied over 50K, 75K, 100K and 125K, respectively. Notably, in each case the fit of the data (black dots) to the model curve (red profile) scores high on the evaluation scale (equation 10) and yields a varying degree of secondary infection proliferation ($R_0$). The insets are close-ups of the corresponding data regions. This observation implies that the model projection is constrained to a short temporal scale.

Basic Reproduction Number ($R_0$)
The compartmental model offers a quantitative way to compute the basic reproduction number, $R_0$ (R-naught), of the prevalent COVID-19 pandemic in a host population. The basic reproduction number is an important gauge of the virulence of an infection in a population and a risk assessment tool; it is expressed as the average number of secondary infections arising from one primary infected individual.

Herd Immunity Threshold
Herd immunity threshold (HIT) represents the fraction of the host population that needs to become immune to the virus to potentially neutralize the infection [36,37,38]. The fraction may be calculated in terms of the basic reproduction number as $1 - 1/R_0$.
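The relation between the threshold and the basic reproduction number is simple enough to check by hand. The sketch below uses the standard expression, one minus the reciprocal of the basic reproduction number, and its inverse; the function names are hypothetical. Applied to the 65.3% threshold quoted in the conclusions, the inverse gives a basic reproduction number of about 2.88, consistent with the value of roughly 3.0 stated in the introduction.

```python
def herd_immunity_threshold(r0):
    """Fraction that must be immune for the outbreak to decline."""
    if r0 <= 1.0:
        return 0.0  # the infection does not spread on average
    return 1.0 - 1.0 / r0

def r0_from_threshold(hit):
    """Invert the relation: basic reproduction number implied by a threshold."""
    return 1.0 / (1.0 - hit)

hit_for_r0_3 = herd_immunity_threshold(3.0)    # two thirds of the population
implied_r0 = r0_from_threshold(0.653)          # from the 65.3% quoted in the text
```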

Doubling Period
Doubling period is a useful long-term metric to probe the temporal development [...]. For COVID-19 in Bangladesh, we compute the latest doubling period for the study period to be 5.16 days, which is comparable to the early-stage doubling periods in global cases [39].
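For pure exponential growth at rate $r$ per day, the doubling period is $\ln 2 / r$. The sketch below applies this textbook relation, which is an assumption here since the expression actually used in the study is not recoverable from the text; the function names are hypothetical. Under that assumption, the reported 5.16-day doubling period corresponds to a growth rate of about 0.134 per day.

```python
import math

def doubling_time(growth_rate):
    """Days needed for the count to double at a constant exponential rate."""
    return math.log(2.0) / growth_rate

def growth_rate_from_doubling(doubling_days):
    """Exponential growth rate (per day) implied by a doubling period."""
    return math.log(2.0) / doubling_days

implied_rate = growth_rate_from_doubling(5.16)  # from the value quoted in the text
check_days = doubling_time(implied_rate)        # round trip back to 5.16 days
```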

Comparative Analysis of Model Performance
In this study, we use both phenomenological and mechanistic models to train and test on the ongoing COVID-19 infection trend in Bangladesh. In our simulation, we find that the Gompertz and Richards models perform the best in capturing the early growth trend observed in the reported data. Table 1 [...]
[...] Figure 6) and Gompertz (equation 4, with the model parameters as in Figure 5) models. The bounding contour around the red model curve indicates the 95% confidence interval (CI) of the model's future projection of the infection progression. The narrower CI along the dataset in the rising stage indicates the high accuracy of the model fitting, and the projection up to the 90-day period shown in the computation also holds to a high accuracy.

Conclusions and Outlook
Bangladesh continues to see unabated exponential growth of the COVID-19 infection, even during the preparation of this paper, 60 days after the first reported case in the country, although suppressive intervention measures are in place. A dynamic epidemiological modeling approach to understand the underlying nature and peculiarities of the infection diffusion dynamics in the country is therefore imperative in order to ascertain and interrogate the effectiveness and shortcomings of interventions [40,41]. Our study explores the available real data in this early phase of the pandemic, despite the inevitable constraints on the size and scope of the data, and attempts a scaled-up assessment of the pandemic throughout the country, employing both phenomenological and mechanistic models. The spatial-temporal dynamics explored in this study can be generalized beyond the premises considered, and may be used to understand the pandemic in other countries, especially those sharing a similar demography. We highlight a few key conclusions of the study as follows:
1. The pandemic in the country follows a strong exponential growth pattern.
2. The study finds population dynamical models based on power-law featured growth, such as the Gompertz model and the generalized logistic model counterpart, [...]
4. The Herd Immunity Threshold (HIT) estimated in the study indicates that 65.3% of the population needs to achieve immunity in order to get out of the pandemic.
5. The study finds that the mechanistic model, based on the currently available data, estimates the doubling period for the infection in Bangladesh to be 5.16 days.
6. The study estimates the upper bound of the total infection size in Bangladesh; according to the estimation, the current growth trend projects that about 45.9 million people may be infected by the coronavirus.
The epidemiological modeling performance study conducted here offers crucial insight into the dynamical features and numerical measures of the outbreak size in Bangladesh, which can be used as guiding tools to assess the responses and outcomes through continuous monitoring of the situation. The basic reproduction number is a crucial indicator of any pandemic, and it must be computed and monitored on a regular basis, more so at the onset phase, in order to undertake and assess the suppressive and mitigating response interventions that safeguard both the lives and the economy of the country. The theoretical modeling and simulation pursued in this study are an important step towards the development of automation-based tools and machine learning techniques (e.g., [42]) to further the understanding of the dynamical aspects of the outbreak and to gauge various response protocols.