COVID-19 India prediction model based on the trend from other countries

This paper presents a COVID-19 prediction model for India. Lockdown plays an important role in arresting the community spread of the disease. This was evident from the study of other countries such as Russia, Belgium and Germany, where lockdown showed an immediate positive effect and peak cases were recorded within a month of its imposition. However, in India, even after 65 days of lockdown, there has been no decrease in the number of daily new cases reported. Many models were prepared for India, and almost all of them were proven wrong by the increase in the number of cases. The model in this paper is prepared using the COVID-19 trend in other countries, population density and the pandemic bell curve. Based on the data available until 24 May 2020, two scenarios are presented. In one, the peak is reached when the number of daily new cases per million reaches 190, and in the second, when the total cases per million reach 724. One model predicts the number of cases to reach 1 million by mid-July 2020. The other model predicts the number of cases to peak by mid-December 2020 with the total cases reaching 20 million. The predicted cases were compared with the actual cases recorded for the period 25 May to 11 June 2020. It was observed that the actual values matched the predicted values quite reasonably.


Introduction
Since the first case of COVID-19 was reported from Wuhan, China, there have been many attempts to predict the spread of the disease by modelling past data and other factors. In India, the first case was reported on 30th Jan 2020 in Kerala, and the cases have since reached more than 0.2 million as of 3rd June 2020, roughly four months later. Some attempts have been made to predict the number of cases, but most of the models could not correctly predict the total number of infected people. A study by Singapore University of Technology (Luo, 2020) predicted that around 97% of cases in India would end by 21st May. IIT Guwahati, in association with Duke-NUS, Singapore, analysed the state-wise COVID-19 situation in states such as Andhra Pradesh, Delhi, Gujarat, Madhya Pradesh, Maharashtra, Uttar Pradesh and West Bengal. The researchers used the data available until 1st May for analysis. However, they did not predict the overall cases in the country.

Prediction models for India
Several other authors have also tried to model the outbreak of COVID-19 in India. Ranjan, 2020, predicted in March, based on the SIR model, that India would enter equilibrium by the end of May with the final number of cases being approximately 13,000. However, he reported that this estimate would be invalid if India entered the stage of community transmission. Although no community transmission has been reported in India yet, the predicted numbers have been greatly exceeded. A considerable number of studies published in April also predicted the number of cases and the time when the curve would flatten in India. Bhatnagar, 2020, and Arti, 2020, do not predict the number of people infected in India. They conclude that lockdown and social distancing play an important role in restricting the spread of the disease. However, Bhatnagar's model predicts the cases of Italy and France quite accurately. Tiwari et al., 2020, used the time series forecasting method and predicted that, in India, the total number of confirmed cases of COVID-19 might reach around 68,978 and the number of deaths due to COVID-19 around 1,557 by around 25th Apr 2020. Bhattacharjee et al., 2020, used the cumulative NCV confirmed cases, recovery cases and deaths to estimate the recovery rate, caseload rate and death rate till 24th April 2020. They predicted that by 20th May 2020 the caseload rate would be lower than the recovery rate and that the number of COVID-19 patients would start reducing thereafter.
Rajesh et al., 2020, used the mathematical model SIR(D) to predict the future of the epidemic in India by using existing data. The model shows that the epidemic will be at its peak around the end of June or the first week of July, with almost 100 million Indians infected, if the lockdown is relaxed after 3rd May 2020. The total size of the infected population would be one-third of this predicted number if only people in the red zones (approximately one-third of India's population) are susceptible to the infection. Even so, they expect that the number of infected people will be at least of the order of 10 million. Tomar and Gupta, 2020, used a data-driven model based on LSTM techniques. Their prediction of the number of positive and recovered cases (until April 2020) obtained by the model is accurate within a certain range.
However, these models could predict neither the number of cases accurately nor the date by which they would start decreasing. This is because the study of the spread is still in very early stages and there is no mathematical model that can predict how the spread would progress in India.

The Proposed Model
Since the established mathematical and statistical models were not able to predict the cases accurately, owing to the limitations of the models and the uncertainty in the behaviour of the spread of the disease, the authors tried a different approach. We relied on studying the spread of the disease in other countries: the trend, the number of daily cases at the peak, the total cases at the peak, and population density as the factor responsible for the spread of the disease. We also took into consideration that humans tend to behave in an almost similar manner in all places. The data for the study was compiled on 24th May 2020 and all the calculations were done taking this date as the baseline. Table-1 shows the data from those countries where the pandemic has peaked and shown a considerable reduction in cases as of now. We have not taken the data from China since it was the first country where the pandemic started and hence the people and the government were caught unawares, which led to the escalation of the number of cases.

Table-1
S. No. | Country | Total cases per million at peak | Total cases per million* | Daily new cases per million at peak | Daily new cases per million* | Population density (per sq. km)
1 | Italy | 777 | 3806 | 99 | 5 | 206
2 | Germany | 607 | 2159 | 83 | 6 | 240
3 | Belgium | 788 | 4950 | 160 | 21 | 383
* as on May 24th, 2020
# source: worldometers.info

The authors have taken two scenarios: one based on the daily new cases per million at the peak and the other based on the total number of cases per million at the peak.

Scenario 1: (Daily new cases):
It is observed from the above table that there is a correlation between the population density and the daily new cases per million. The ratio of population density to the daily new cases per million at the peak was found to be 2.08, 2.89 and 2.4 for Italy, Germany and Belgium respectively (Table-2). Taking the average of these values as the ratio for India, the daily new cases at the peak would be around 190 per million, against 5.1 per million at present.

Table-2 notes: average of all the countries studied; @ calculated at the average ratio.

To arrive at the peak, the pattern of new cases per week in India was studied. Table-3 shows the average number of cases per million for every week since 3rd March 2020. Although the first case in India was recorded in January, a major rise in new cases was noticed on March 4th, when 23 cases were recorded. However, the number of cases per week till March 24th was negligible and has not been taken into account. Figure-1 shows the daily new cases for the period from 25th March till 11th June (the date of writing this paper) and the lockdown stages (different coloured bars). Since the rate of increase in cases has been rising every week (black vertical lines show the weeks), even during the lockdown, it was considered prudent to derive the future weekly increase in the number of cases from the past weekly trend. Following this trend, and given that the restrictions imposed earlier have now been eased (starting from Lockdown 3, and after June 1 almost all activities have been permitted), we have assumed that the number of cases per week shall increase at a rate of almost 1000 cases till it reaches the peak of 100 cases per million (best-case scenario) or 190 cases per million (worst-case scenario).
Under the best-case scenario, Fig-2 shows the predicted average number of cases in the coming weeks/months. The daily cases shall peak in the 2nd week of October 2020 at around 130k daily cases, with total cases reaching approximately 8 million. The daily cases shall decrease to around 5k by the last week of Feb 2021 and shall cease to exist by mid-May 2021. By this time, the cumulative cases shall reach approximately 15 million.
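The ratio calculation above can be reproduced in a few lines. The following is a minimal sketch, not the authors' code: the peak daily new cases per million and the population densities are the Table-1 values for Italy, Germany and Belgium, while India's population density of 464 people per sq. km is an assumption that does not appear in the text, and the names TABLE_1, INDIA_DENSITY and density_ratio are purely illustrative.

```python
# Table-1 values: daily new cases per million at the peak and
# population density (people per sq. km).
TABLE_1 = {
    "Italy": {"peak_daily_per_million": 99, "density": 206},
    "Germany": {"peak_daily_per_million": 83, "density": 240},
    "Belgium": {"peak_daily_per_million": 160, "density": 383},
}

# ASSUMPTION: India's population density (people per sq. km).
# This value is not given in the text and is used only for illustration.
INDIA_DENSITY = 464


def density_ratio(row):
    """Ratio of population density to peak daily new cases per million."""
    return row["density"] / row["peak_daily_per_million"]


# Per-country ratios and their simple average.
ratios = {country: density_ratio(row) for country, row in TABLE_1.items()}
mean_ratio = sum(ratios.values()) / len(ratios)

# Projected peak daily new cases per million for India at the average ratio.
india_peak_daily_per_million = INDIA_DENSITY / mean_ratio

for country, ratio in ratios.items():
    print(f"{country}: {ratio:.2f}")
print(f"Average ratio: {mean_ratio:.2f}")
print(f"India, projected peak: {india_peak_daily_per_million:.0f} per million")
```

With these inputs the average ratio is roughly 2.46 and the projected peak comes out at roughly 189 daily new cases per million, in line with the figure of about 190 used in this scenario.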

Figure-2: Predicted daily new cases for the peak at 100 cases per million
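The best-case figures quoted above (about 8 million cumulative cases at the peak and about 15 million at the end) are consistent with a bell curve that is roughly symmetric about its peak, where about half of all cases fall before the peak. The sketch below illustrates this property only. It assumes a Gaussian shape, which the text does not specify, and a hypothetical spread of 45 days; the function names and parameters are illustrative, not the authors' model.

```python
import math


def gaussian_daily_cases(peak_daily, sigma_days, n_days):
    """Daily new cases on a symmetric bell curve centred at day n_days // 2."""
    centre = n_days // 2
    return [
        peak_daily * math.exp(-((day - centre) ** 2) / (2 * sigma_days ** 2))
        for day in range(n_days)
    ]


def cumulative(series):
    """Running total of a daily series (the S-curve)."""
    total = 0.0
    running = []
    for value in series:
        total += value
        running.append(total)
    return running


# ASSUMPTIONS: Gaussian shape and sigma_days=45 are hypothetical choices;
# 130,000 is the best-case peak of daily cases quoted in the text.
daily = gaussian_daily_cases(peak_daily=130_000, sigma_days=45, n_days=541)
s_curve = cumulative(daily)

# Locate the peak day and the share of all cases recorded up to that day.
peak_day = max(range(len(daily)), key=daily.__getitem__)
share_at_peak = s_curve[peak_day] / s_curve[-1]

print(f"Peak day index: {peak_day}")
print(f"Share of final cumulative cases reached at the peak: {share_at_peak:.2f}")
```

For any symmetric curve of this kind the share at the peak is close to one half, so the final cumulative count is about twice the count at the peak, whatever spread is chosen.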
Extrapolating the same for 190 cases per million in the worst-case scenario, the peak in India shall be reached in mid-Dec 2020 and the total number of cases shall be 20 million (Fig-3). The daily average number of cases shall rise to around 260k and the cumulative cases will reach 40 million. By July 2021, the daily cases would reach around 5000 (Figure-4) and shall cease to exist by the end of September 2021.

Scenario 2: (Total cases):
In this scenario, we have taken the total number of cases per million at the peak, after which the pandemic spread started decreasing, for Italy, Germany and Belgium. Table-4 shows the total cases per million for Italy, Germany and Belgium, which vary between 600 and around 800. For India, we have taken the average of these three countries and assumed that the number of total cases per million shall reach 724 at the peak. At the same rate of 1000 new cases per week as assumed in scenario 1, the total cases shall reach around 1 million at the peak by mid-July 2020, when the daily new cases would be in the range of 34k and new cases per million would be around 25. Beyond mid-July 2020, the cases shall start decreasing following the pandemic bell curve, and by mid-September the daily new cases shall reduce to 5000 with cumulative cases reaching 2 million. This model predicts that the cases shall cease to exist by the end of November 2020 and that from December no new cases shall be reported. The bell curve for daily new cases and the S-curve for cumulative cases are shown in Fig-5 and Fig-6, respectively. However, this scenario seems unlikely, as the number of cases has started increasing significantly in the last week after restrictions were eased and almost all establishments started functioning, with domestic travel by both rail and air also allowed.
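The arithmetic behind the second scenario (total cases per million at the peak) can be checked as follows. This is a minimal sketch under one stated assumption: India's population is taken as 1,380 million, a figure that does not appear in the text. The per-million totals at the peak are the Table-1 values, and the variable names are illustrative.

```python
# Total cases per million at the peak, from Table-1.
PEAK_TOTAL_PER_MILLION = {"Italy": 777, "Germany": 607, "Belgium": 788}

# ASSUMPTION: India's population in millions (not stated in the text).
INDIA_POPULATION_MILLIONS = 1380

# Average of the three countries, used as India's total cases per million at peak.
avg_peak_total_per_million = (
    sum(PEAK_TOTAL_PER_MILLION.values()) / len(PEAK_TOTAL_PER_MILLION)
)

# Total cases in India at the peak implied by that average.
india_total_at_peak = avg_peak_total_per_million * INDIA_POPULATION_MILLIONS

# Daily new cases per million implied by about 34k daily cases at the peak.
daily_cases_at_peak = 34_000
daily_per_million_at_peak = daily_cases_at_peak / INDIA_POPULATION_MILLIONS

print(f"Average total cases per million at peak: {avg_peak_total_per_million:.0f}")
print(f"Implied total cases in India at peak: {india_total_at_peak:,.0f}")
print(f"Implied daily new cases per million: {daily_per_million_at_peak:.1f}")
```

The average works out to 724 per million. With the assumed population this gives just under 1 million total cases at the peak, and 34k daily cases corresponds to about 25 per million.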

Discussion
A COVID-19 prediction model for India has been presented in this paper by analysing the case trends in other countries and drawing out the similarities among them. The authors used the daily new cases per million and the total cases per million as two parameters to predict the number of cases, along with their relationship with population density. Population density and daily new cases are correlated to some extent, and this relationship can be used to predict the number of cases. The disease is contagious: an infected person who comes in contact with one or more persons is liable to infect some or all of them. Population density may therefore be said to play an important role in the spread of the disease. India has a similar population density to the other countries considered under the present study, where the number of cases has reduced significantly. The authors used the peak arrival stage for these countries and then incorporated the average values for India to prepare the model. Based upon population density and the trend in the number of daily cases in India, it was found that the total number of cases shall reach 20 million by the end of this year, and that it shall take another 6 months for the cases to reduce significantly and 9 months before no new cases are reported. By that time, the total number of affected people in India is predicted to reach 40 million. This model fits quite reasonably for the week in which the paper was written, in terms of both the cumulative cases and the rise in the daily number of cases.
No early decline from the peak is in sight, as is also evident from the testing data for India. Fig-9 shows the daily tests done and the number of positive cases reported. It is evident from the figure that the number of tests has almost stagnated at around 150k per day while the daily reported cases are increasing, which narrows the gap between samples tested and reported cases. There is a need to increase the number of tests per day; only when the gap starts widening can it be concluded that the disease is in decline, which does not seem likely in the near term. This also indicates that the model prediction remains valid.

Fig-9: Samples tested vs the number of new cases reported (till 9th June 2020)

Conclusion
There are several uncertainties in the spread of the disease because there is no set pattern of propagation among people. At times, contact tracing succeeds in locating the source of infection, but in many cases this is difficult or impossible. However, models can be prepared by studying the trend in other countries in terms of when the case numbers peak and when they reduce significantly. Two scenarios have been presented by the authors and the predicted values have been compared with the actual data for 18 days. Over these 18 days, the model matched the actual number of daily new cases on 73% of the days, and the predicted cumulative cases differed from the actual cumulative cases of COVID-19 in India by an average of just 2%.
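The two validation figures above (share of days matched and average variation) can be computed along the following lines. This is a sketch under stated assumptions, not the authors' procedure: the text does not define when a day counts as matched, so a relative tolerance of 5% is assumed here, and the input series are hypothetical placeholder values rather than the actual or predicted case counts.

```python
def share_of_days_within(predicted, actual, tolerance):
    """Fraction of days on which the prediction is within a relative tolerance."""
    hits = sum(
        1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance * a
    )
    return hits / len(actual)


def mean_abs_pct_variation(predicted, actual):
    """Average absolute variation of predictions, as a fraction of actual values."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)


# HYPOTHETICAL placeholder values for illustration only; NOT the paper's data.
predicted_daily = [100, 110, 120, 130]
actual_daily = [98, 125, 121, 128]

# ASSUMPTION: a day counts as matched if the prediction is within 5% of actual.
matched_share = share_of_days_within(predicted_daily, actual_daily, tolerance=0.05)
avg_variation = mean_abs_pct_variation(predicted_daily, actual_daily)

print(f"Share of days matched: {matched_share:.0%}")
print(f"Average variation: {avg_variation:.1%}")
```

The same two functions can be applied to the daily series for the first figure and to the cumulative series for the second, once a matching criterion has been fixed.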

Funding
The research did not receive any funding from any agency.