What Can We Learn from Burkina Faso COVID-19 Data? Using Phenomenological Models to Characterize the Initial Growth Dynamic of the Outbreak and to Generate Short-Term Forecasts

On 9 March 2020, two cases of COVID-19 were reported in Burkina Faso. As of 10 April 2020, a total of 484 cases (404 in the Kadiogo province) had been reported nationwide. Real-time forecasts of COVID-19 are important to inform decision-making in the country. Here, we propose an approach that tests the performance of four models (the Exponential Growth model, the Generalized Growth model (GGM), the Generalized Logistic Growth model, and the Richards Growth model) to select the model that best fits the data and to generate short-term forecasts (5-, 10-, and 15-day forecasts from 11 to 25 April 2020) in Kadiogo, the epicenter of the outbreak. Using the daily number of confirmed COVID-19 cases, the results suggest that the GGM performed best of the four models. Overall, our GGM predictions suggested an average cumulative number of cases of 514 (95% CI, 464–559), 629 (95% CI, 559–691), and 750 (95% CI, 661–840) for 11 to 15 April, 16 to 20 April, and 21 to 25 April 2020, respectively. COVID-19 in this province was best approximated by sub-exponential growth rather than exponential or logistic growth. Current data suggest that COVID-19 cases will continue to increase over the next 15 days.

economic structure of Burkinabe society, which is largely dominated by the activities of the informal sector [14] and which forces the population to live hand-to-mouth, makes absolute isolation difficult. In addition, the country, characterized by a weak health system infrastructure [15], would not have sufficient resources to diagnose all cases or to manage critical cases that require respiratory assistance.
In such a context, characterizing and forecasting the trend of the early outbreak growth profile appears crucial [16,17] for an optimal allocation of prevention measures, medical resources, organization of production activities, and, eventually, maintenance of national economic development throughout the country. Phenomenological approaches in modeling the spread of the disease are particularly appropriate when substantial uncertainties interfere with the epidemiology of the outbreak, including the potential contribution of multiple transmission routes [18]. In addition, phenomenological models provide a starting point for providing early estimates of the potential spread, understanding the evolution patterns of the outbreak, and generating short-term forecasts of the trajectory of the outbreak and forecasts of the final size of the outbreak [16][17][18].
In the present study, four phenomenological models (Generalized Growth, Generalized Logistic Growth, Exponential Growth, and Richards Growth), which have been validated in previous outbreaks [17,19-23] (Ebola virus, influenza, smallpox, plague, measles, HIV/AIDS, severe acute respiratory syndrome, Zika virus, COVID-19, etc.), were applied to the data to identify the model that best fits the data. That model was then used to characterize the early ascending phase of the outbreak and to generate short-term forecasts in Kadiogo province, the epicenter of the outbreak in Burkina Faso [18,24,25].

Data sources
For the purpose of our study, we collected data on the number of daily confirmed new cases covering the period from 09 March to 10 April 2020. These data were aggregated at the national and provincial levels. We focused on the province of Kadiogo, which accounted for approximately 83.5% of the cases, and we did not consider the other provinces in order to avoid the effect of demographic noise in small outbreaks (fewer than 100 cases) [26].

Models
We tested four phenomenological models in order to select the model that best fit our data. These growth models are defined by differential equations, referred to as Formulas (1)-(4), which represent the Exponential Growth, the Generalized Growth, the Generalized Logistic Growth, and the Richards Growth models, respectively. In these formulas, C'(t) is the incidence curve over time t, the solution C(t) describes the cumulative number of cases at time t, r is a positive parameter denoting the growth rate, p is a "deceleration of growth" parameter, and α is an exponent that measures the deviation from the symmetric S-shaped dynamics of the simple logistic curve. K is the final epidemic size or carrying capacity.
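For reference, the four models are commonly written in the phenomenological-modeling literature in the forms below. We assume these standard forms correspond to Formulas (1)-(4); the symbols follow the definitions given above.

```latex
\begin{align}
C'(t) &= r\,C(t) \tag{1} \\
C'(t) &= r\,C(t)^{p} \tag{2} \\
C'(t) &= r\,C(t)^{p}\left(1 - \frac{C(t)}{K}\right) \tag{3} \\
C'(t) &= r\,C(t)\left[1 - \left(\frac{C(t)}{K}\right)^{\alpha}\right] \tag{4}
\end{align}
```

Under these forms, setting p = 1 in (2) recovers (1), and setting p = 1 in (3) or α = 1 in (4) recovers the simple logistic model.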
After testing, the generalized growth model (GGM) appeared to best fit the Kadiogo province COVID-19 data (Figure S1, Figure S2 and Figure S3). The GGM, with two parameters (r and p), is an extension of the simple exponential growth model, in which an additional parameter, p, allows for scaling of growth. The simple exponential model assumes exponential growth dynamics in the absence of control interventions, which implies an exponential increase in the cumulative number of COVID-19 cases, C(t). To relax this assumption, we considered the simple generalized model [19,27,28] given by Formula (2) described above.
In the GGM, the deceleration of growth parameter p varies between 0 and 1, which enables the model to capture sub-exponential or early polynomial growth [19,27,28]. For instance, the model predicts early exponential growth when p is equal to 1, whereas a value of 0 corresponds to constant incidence over time, that is, linear growth of the cumulative number of cases.
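The role of p can be illustrated with a minimal Python sketch, assuming the GGM takes the standard form C'(t) = r C(t)^p, which has a closed-form solution. The function names and the values of r and C(0) are hypothetical and chosen only for illustration; they are not estimates from the Kadiogo data.

```python
import math

def ggm_cumulative(t, r, p, c0):
    """Closed-form solution C(t) of C'(t) = r * C(t)**p with C(0) = c0."""
    if p == 1.0:
        return c0 * math.exp(r * t)  # p = 1: exponential growth
    base = c0 ** (1.0 - p) + r * (1.0 - p) * t
    return base ** (1.0 / (1.0 - p))  # 0 <= p < 1: polynomial growth

def ggm_incidence(t, r, p, c0):
    """Incidence C'(t) = r * C(t)**p."""
    return r * ggm_cumulative(t, r, p, c0) ** p

# Illustrative (not fitted) values for the growth rate and initial case count.
r, c0 = 0.5, 2.0
for p in (0.0, 0.5, 1.0):
    curve = [round(ggm_incidence(t, r, p, c0), 2) for t in (0, 5, 10)]
    print(f"p = {p}: incidence at t = 0, 5, 10 -> {curve}")
```

With p = 0 the incidence stays equal to r at every time point, with p = 0.5 it grows linearly in t, and with p = 1 it grows exponentially.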

Data fitting, calibration, and Short-term forecasts
For the data fitting, we used the number of daily new confirmed cases over a period of 30 days, from 12 March to 10 April 2020, because we noted that it was only four days after the first notification date (09 March 2020) that cases began to be reported regularly, on a daily basis, as suspected cases were tested. We used nonlinear least squares fitting to fit the data and to obtain initial estimates of the parameters (r and p) that minimize the sum of squared errors between the model and the data.
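The least-squares criterion can be sketched as follows. This is a simplified illustration rather than the procedure used in the study: a coarse grid search stands in for a nonlinear least-squares solver, the GGM is assumed to have the form C'(t) = r C(t)^p with a fixed, assumed initial value C(0), and the series is synthetic and noise-free, not the Kadiogo data. All function names are hypothetical.

```python
import math

def ggm_incidence(t, r, p, c0):
    """Incidence C'(t) = r * C(t)**p, using the closed-form C(t) with C(0) = c0."""
    if p == 1.0:
        c = c0 * math.exp(r * t)
    else:
        c = (c0 ** (1.0 - p) + r * (1.0 - p) * t) ** (1.0 / (1.0 - p))
    return r * c ** p

def sse(r, p, days, cases, c0):
    """Sum of squared errors between modeled and observed daily incidence."""
    return sum((ggm_incidence(t, r, p, c0) - y) ** 2 for t, y in zip(days, cases))

def fit_ggm(days, cases, c0):
    """Grid search over r in {0.02, ..., 2.00} and p in {0.00, ..., 1.00}."""
    grid = [(i / 50, j / 100) for i in range(1, 101) for j in range(101)]
    r_hat, p_hat = min(grid, key=lambda rp: sse(rp[0], rp[1], days, cases, c0))
    rmse = math.sqrt(sse(r_hat, p_hat, days, cases, c0) / len(days))
    return r_hat, p_hat, rmse

# Synthetic, noise-free series (NOT the Kadiogo data): 30 days, assumed r, p, C(0).
days = list(range(30))
synthetic = [ggm_incidence(t, 0.6, 0.6, 2.0) for t in days]
r_hat, p_hat, rmse = fit_ggm(days, synthetic, c0=2.0)
print(f"r = {r_hat:.2f}, p = {p_hat:.2f}, RMSE = {rmse:.4f}")
```

A gradient-based solver would normally replace the grid search; the grid keeps the sketch dependency-free and makes the objective being minimized explicit.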
We calibrated each of the models to the daily case incidence reported for Kadiogo province over 10 days (12 to 21 March 2020), 15 days (12 to 26 March 2020), 20 days (12 to 31 March 2020), and 25 days (12 March to 05 April 2020), respectively. We implemented simulation approaches to generate the outbreak trajectories for short-term forecasting based on the uncertainty in the parameter estimates [29]. Based on the best-fitting model, the GGM, we performed short-term forecasts for 5 days (11 to 15 April), 10 days (16 to 20 April), and 15 days (21 to 25 April). To generate the 95% confidence intervals (95% CI) associated with the model estimates, we used a parametric bootstrap with M = 500 simulated datasets, assuming a Poisson error structure [30].
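The bootstrap step can be sketched as follows, under several simplifying assumptions: the GGM has the form C'(t) = r C(t)^p with a fixed, assumed C(0); refitting uses a coarse grid search instead of a nonlinear least-squares solver; the "observed" series is synthetic, not the Kadiogo data; and M = 100 replicates are used here for speed, whereas the study used M = 500. All function names are hypothetical.

```python
import math
import random

def ggm_incidence(t, r, p, c0):
    """Incidence C'(t) = r * C(t)**p from the closed-form GGM solution."""
    if p == 1.0:
        c = c0 * math.exp(r * t)
    else:
        c = (c0 ** (1.0 - p) + r * (1.0 - p) * t) ** (1.0 / (1.0 - p))
    return r * c ** p

def poisson(lam, rng):
    """Knuth's Poisson sampler, with a normal approximation for large means."""
    if lam > 500:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

# Coarse parameter grid: r in {0.05, ..., 2.00}, p in {0.00, ..., 1.00}.
GRID = [(i / 20, j / 20) for i in range(1, 41) for j in range(21)]

def build_curves(n_days, c0):
    """Precompute the model incidence curve for every (r, p) on the grid."""
    return {rp: [ggm_incidence(t, rp[0], rp[1], c0) for t in range(n_days)]
            for rp in GRID}

def fit(cases, curves):
    """Return the grid point with the smallest sum of squared errors."""
    return min(curves, key=lambda rp: sum((m - y) ** 2
                                          for m, y in zip(curves[rp], cases)))

def bootstrap_forecast(cases, c0, horizon, m=100, seed=1):
    """Mean and approximate 95% interval of cumulative cases after `horizon` days."""
    rng = random.Random(seed)
    n = len(cases)
    curves = build_curves(n + horizon, c0)
    best = curves[fit(cases, curves)]
    totals = []
    for _ in range(m):
        resampled = [poisson(lam, rng) for lam in best[:n]]   # simulated dataset
        refit = curves[fit(resampled, curves)]                # re-estimated (r, p)
        future = [poisson(lam, rng) for lam in refit[n:]]     # simulated forecast
        totals.append(sum(cases) + sum(future))
    totals.sort()
    return sum(totals) / m, totals[int(0.025 * (m - 1))], totals[int(0.975 * (m - 1))]

# Synthetic "observed" series (NOT the Kadiogo data), 30 days, assumed parameters.
rng0 = random.Random(0)
cases = [poisson(ggm_incidence(t, 0.6, 0.6, 2.0), rng0) for t in range(30)]
mean_total, lo, hi = bootstrap_forecast(cases, c0=2.0, horizon=15)
print(f"15-day cumulative forecast: {mean_total:.0f} (95% interval {lo} to {hi})")
```

Each replicate propagates two sources of uncertainty: the parameter uncertainty captured by refitting to a simulated dataset, and the Poisson noise in the forecast counts themselves.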
To assess the performance of our modeling, we used the root mean square error (RMSE) as the performance metric.
It is noticeable that a large share of the population of the country is concentrated in these two major provinces (22.8% of the whole population). However, since COVID-19 has not reached all the provinces, it should be pointed out that, in terms of geographical distribution, it is the western part of the country, followed by the central area, that has been affected. Figure 2 shows the evolution of the daily confirmed cases notified between 09 March and 10 April 2020, as well as their cumulative count, in the whole country and in Kadiogo province on its own. The results showed an increase in the cumulative number of cases in the whole country, with an average of 14 cases reported per day. However, we noted that it was only four days after the first notification date (09 March 2020) that cases began to be reported regularly on a daily basis. Since that date, in Kadiogo province, an average of 13 cases have been reported per day. Due to the relatively small number of cases in the other provinces of the country, we focused our analysis on Kadiogo province alone, as it accounts for the vast majority of the cases in the country.
As mentioned in Section 2, after testing the four phenomenological models, we found that the generalized growth model (GGM) was best suited to the Burkinabe COVID-19 data. The results of fitting the GGM are presented in Figure 3 and Figure S1. The results for the other models, namely the Exponential Growth, the Generalized Logistic Growth, and the Richards Growth models, are presented in Figure S2, Figure S3 and Figure S4. The values of the GGM parameters and the RMSE for the calibration and forecasting periods are presented in Table 1. As the number of epidemic days increased, the value of the growth rate (r) increased whereas the value of the "deceleration of growth" parameter (p) decreased, indicating a growing departure from exponential growth.

Discussion
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 19 April 2020 doi:10.20944/preprints202004.0328.v1
In this study, our findings showed that COVID-19 cases in Burkina Faso are concentrated in areas with intense population mixing. Moreover, the provinces neighboring those with the highest case counts are also beginning to register cases.
This suggests that population movements increase the transmission of COVID-19 [31,32], which provides support for the government's strategy to place provinces with very high case loads into lockdown and to avoid the "non-essential" movement of people. This strategy, which aims at reducing population movement and increasing physical distance or reducing the frequency of meetings in high-density community settings, such as schools, prayer halls, markets, or workplaces, plays a crucial role in outbreak control [33,34]. While maintaining continuous surveillance in all of the provinces of the country (if possible, disaggregated into health districts and municipalities), the "non-essential" movements into and out of the epicenter province must be controlled as a matter of priority. Based on the available data, if these measures are well implemented and are followed by the population, the COVID-19 situation could change within a few days.
We applied phenomenological models to COVID-19 data to characterize the early outbreak growth profile in the most affected province in Burkina Faso and to provide short-term forecasts. The COVID-19 pattern in Kadiogo province was best approximated by sub-exponential growth rather than by exponential or logistic growth models. Overall, our forecasted values suggested that the outbreak has progressed slowly and will continue to grow for the next 15 days. Several factors could explain the slower-than-exponential growth pattern in Kadiogo, Burkina Faso, including (1) the age structure of the population and (2) the level of testing currently performed in Burkina Faso.
The first, and probably most important, factor relates to demographics and the population pyramid: the median age is 17.9 years [35], and the life expectancy is 63 years [36]. Large-scale epidemiological studies in China, Europe, and the USA have shown that the median age of COVID-19 patients is approximately 45 years [37,38], but that the severity of the disease and the risk of death increase exponentially with age, with patients aged above 65 years being at higher risk [39]. In this context of a relatively young population, it is likely that part of the population is infected but presents only limited symptoms and does not seek medical consultation; in the absence of a systematic screening campaign, such cases are not reported. It has also been previously suggested that cold temperature and low humidity might provide conducive environmental conditions for prolonged viral survival, as for other types of SARS-CoVs [40]; the hot climate in Burkina Faso could, therefore, be a protective factor. We could also hypothesize that, due to the high burden of infectious disease and the low level of healthcare management, which are responsible for a high child mortality rate, only individuals with higher resistance to infectious diseases survive.
The second point is that the data on COVID-19 in Kadiogo, Burkina Faso change daily and only take into account people who are sick and seek diagnosis, or whom the health centers have the capacity to test. In fact, screening is not generalized and only concerns specific populations: hospitalized patients, suspected cases, and health care staff with suspected symptoms. Therefore, we cannot determine whether the low number of observed cases, compared to other countries worldwide, including North African countries and South Africa [41], reflects under-reporting and under-diagnosis due to lower testing capacity. Indeed, over the last month, covering our period of analysis, about 1192 tests were performed in Burkina Faso, of which 29% were positive [42]. This percentage of positive test results is relatively similar to that of European countries, where this value appears to oscillate between 20% and 30% [43]. This would suggest that if the Burkina Faso health system had the capacity to screen as widely as other countries do, the number of actual cases could far exceed the number of cases currently reported.
A last point is that the higher prevalence of infectious diseases (e.g., malaria) in the Sub-Saharan population could explain a lower sensitivity to this virus.

Conclusion
Our study provided insight into the early spread of the outbreak, which could allow optimal planning of prevention measures and medical resources, as well as organization of production activities throughout the province. Our results suggest that the confirmed COVID-19 case data in Kadiogo, Burkina Faso exhibit a slower growth pattern that is best approximated by a sub-exponential growth model rather than by exponential or logistic functions. The findings indicate that COVID-19 cases are likely to continue to increase for the next two weeks.