Unravelling the social network of COVID-19 in India from 30 January to 6 April 2020

Social network analysis is an essential means of uncovering and examining infectious contact relations between individuals. This paper investigates the spread of coronavirus disease (COVID-19) from the international to the national level and identifies a few super spreaders that played a central role in the transmission of the disease in India. Our network metrics, calculated from 30 January to 6 April 2020, revealed that the maximum numbers of connections were established from Dubai (degree-144) and the UK (degree-64). These two sources played a crucial role in diffusing the disease in Indian states. The eigenvector centrality of Dubai was found to be the highest, which marked it as the most influential node. However, based on the modularity class, we found that different clusters were formed across Indian states, which demonstrated the formation of a multi-layered social network structure. A significant increase in confirmed cases was reported during lockdown 1.0 (22 March 2020), primarily attributed to a gathering at the Delhi Religious Conference (DRC) known as Tablighi Jamaat. As of 6 April 2020, the overall structure of the network encompassed local transmission, which was seen most prominently in states like Gujarat, Rajasthan, and Karnataka. An important conclusion drawn from the presented social network is that the COVID-19 spread until 6 April was mainly due to local transmission across Indian states. Because infected cases from the DRC were quarantined in time, the spread did not reach the level of community transmission.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 24 May 2020 doi:10.20944/preprints202005.0382.v1 © 2020 by the author(s). Distributed under a Creative Commons CC BY license.


Introduction
In December 2019, China reported several cases of unusual pneumonia in Wuhan, Hubei province, to the World Health Organization (WHO) country office 1 . In India, the first case was reported on 30 January 2020 in a traveller from Wuhan, China. In the absence of any cure, this disease could have been devastating for a vast country like India, with its population of 1.3 billion. However, the infection rate of COVID-19 in India was reported to be 1.7, which is remarkably lower than in the other affected countries 3 . This slow spread could be mainly due to the timely country-wide lockdown 1.0, which commenced on 24 March 2020 for 21 days 4 . To control the pandemic, the Indian government enacted a range of social distancing strategies, such as city-wide lockdowns, screening measures at train stations and airports, and isolation of suspected cases 5 . Furthermore, widespread vaccination against tuberculosis or resistance to malaria may have helped India resist the pandemic to some extent.
Perhaps for these reasons, the rate of infection in India was slow 6 . As of 10 April, India's five worst-hit states were Maharashtra, Delhi, Tamil Nadu, Rajasthan and Telangana, which were declared hotspots in terms of the total number of COVID-19 infections 7 .
In the initial phase, the transmission of COVID-19 was mainly due to international travel. A large number of Indians and foreigners travelled to Indian states from places such as the UK, UAE, Italy, Wuhan, Dubai, the USA, Saudi Arabia, Iran, the Philippines, Thailand, and Indonesia. Diseases that spread through contact between people increase the risk of an outbreak, but understanding how they spread through a population remains a challenge. The explosion of devastating infections, such as SARS (2003), Ebola (2014-2015), and Zika (2015-2016), has shown that the dynamics behind the spread of disease are complex and limit our ability to predict and control epidemics. Therefore, contact patterns can be used to analyze the dynamics of the disease.
A network can be characterized through statistical metrics such as degree, modularity, and centrality, which are the essential factors that quantify a network [8][9] . Social networks generally represent connections in the form of a graph where individuals are nodes and the lines connecting them are edges. Edges can carry the strength of an interaction and can be either directed or undirected. In summary, social-network analysis (SNA) [10][11][12][13][14][15][16] provides methods to measure the social interactions in a population, which in turn can quantify the social structure of an occurrence. Measures of centrality (degree, strength, eigenvector centrality, and closeness) are typically the metrics most directly relevant to disease research because they measure vital aspects of an individual's connectivity or importance to the overall social structure. Most real networks contain parts in which the nodes are more highly connected to each other than to the rest of the network. The sets of such nodes are usually called clusters, communities, cohesive groups, or modules.
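As a rough illustration of the degree and eigenvector centrality measures described above, the following minimal sketch computes both on a small, made-up contact graph. The edge list, the function names, and the use of a shifted power iteration are assumptions for illustration only; they are not the study's data or the routine Gephi uses internally.

```python
from collections import defaultdict
from math import sqrt

# Toy, made-up edges (origin of infection, affected state); NOT the study's data.
toy_edges = [
    ("Dubai", "Kerala"), ("Dubai", "Maharashtra"), ("Dubai", "Karnataka"),
    ("UK", "Maharashtra"), ("UK", "Delhi"), ("Italy", "Delhi"),
]

def build_adjacency(edge_list):
    """Undirected adjacency map: node -> set of neighbouring nodes."""
    adj = defaultdict(set)
    for u, v in edge_list:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)

def degree(adj):
    """Degree = number of distinct neighbours of each node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}

def eigenvector_centrality(adj, max_iter=1000, tol=1e-10):
    """Power iteration on A + I (the shift avoids oscillation on bipartite graphs).

    Assumes a connected graph; scores are scaled to unit Euclidean norm.
    """
    x = {node: 1.0 for node in adj}
    for _ in range(max_iter):
        # Each node keeps its own score and adds the scores of its neighbours.
        nxt = {n: x[n] + sum(x[m] for m in adj[n]) for n in adj}
        norm = sqrt(sum(v * v for v in nxt.values()))
        nxt = {n: v / norm for n, v in nxt.items()}
        if sum(abs(nxt[n] - x[n]) for n in adj) < tol:
            return nxt
        x = nxt
    return x

adj = build_adjacency(toy_edges)
deg = degree(adj)
cent = eigenvector_centrality(adj)
for node in sorted(adj, key=cent.get, reverse=True):
    print(f"{node:12s} degree={deg[node]} eigenvector={cent[node]:.3f}")
```

In this toy graph the node with the most connections also receives the highest eigenvector score, but the two measures need not agree in general: eigenvector centrality rewards being connected to well-connected nodes, not just having many connections.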
The community detection problem (CDP) is defined as the division of a graph into clusters or groups of nodes, each with strong internal cohesion (i.e. a high density of edges within the group) and weak external cohesion (i.e. a low density of edges to nodes outside the group).
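One standard way to make this definition quantitative, given here as general background rather than as a formula stated in this paper, is the modularity Q of a partition of an undirected graph:

```latex
Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Here A_ij is 1 if nodes i and j are connected and 0 otherwise, k_i is the degree of node i, m is the total number of edges, c_i is the community of node i, and delta(c_i, c_j) is 1 when both nodes belong to the same community and 0 otherwise. Q is larger when more edges fall within communities than would be expected if edges were placed at random between nodes of the same degrees.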
Several well-known methods documented in the literature construct such communities by optimizing a quality measure known as modularity [17][18][19][20] . In the present work, the network was created in Gephi (version 0.9.2), which uses the Louvain method for community detection 21 .
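To give a feel for what a modularity-based method optimizes, the sketch below scores two candidate partitions of a toy graph. It is not the Louvain algorithm itself and not Gephi's implementation; the graph, the function name `modularity`, and the partitions are illustrative assumptions, and the graph is taken to be undirected and unweighted.

```python
from collections import defaultdict

def modularity(edge_list, community_of):
    """Modularity Q = sum over communities c of [L_c / m - (d_c / (2m))**2].

    m   : total number of edges
    L_c : number of edges with both ends in community c
    d_c : sum of the degrees of the nodes in community c
    Assumes an undirected, unweighted graph with no self-loops.
    """
    m = len(edge_list)
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edge_list:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

# Toy graph: two triangles joined by one bridge edge (c-d).
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
print(f"two communities: Q = {q_two:.4f}")
print(f"one community:   Q = {q_one:.4f}")
```

On this toy graph, splitting along the bridge gives Q = 5/14 (about 0.357), whereas placing every node in a single community gives Q = 0. A modularity-optimizing method would therefore prefer the two-triangle split.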
The main objective of this paper is to present the social network behind the spread of COVID-19 in India; it traces the situation from the beginning and shows how the outbreak ballooned across Indian states through cluster formation. The presented work is an essential contribution, as few studies are available on the COVID-19 transmission network as a whole.

Results and discussion
[...] groups across Indian states. Therefore, we can conclude from Table 1 (also seen in Figure 1) that travellers from the UK played the central role in transmitting the disease in India, whereas Dubai has the highest eigenvector centrality, which means it was the most influential node.
Table 2 summarises the metrics which quantify the connections in Figure 3. It is seen from the table that the maximum number of infected cases traced to the DRC was in Tamil Nadu (degree-385), followed by Delhi (degree-301), Andhra Pradesh (degree-138), Assam (degree-24), and Uttar Pradesh (degree-11). However, it is interesting to note that although the degree of connections is very high in these states due to the DRC, they formed very few clusters outside their community. For example, the modularity class of Andhra Pradesh is one, Delhi two, Tamil Nadu three, and so on, which shows that the transmission due to the DRC remained confined to a few states.

Conclusions:
The network analysis of social contacts reveals that in the initial phase of transmission, large numbers of regional connections were established mainly from places such as Dubai and the UK. From the statistical metrics, it is found that Dubai had a degree of 144, and its eigenvector centrality was the highest. However, an interesting observation is that the modularity class (number of clusters) formed from the UK is seven. Therefore, we can conclude that the UK played a central role in transmitting the disease in India, whereas Dubai has the highest eigenvector centrality, which means it was the most influential node. Further, we found that the modularity class of states such as Tamil Nadu, Delhi, and Andhra Pradesh, whose residents attended the DRC, is very low. Hence, it is likely that their role in the spread of the disease outside their community was small. On the other hand, it was found that Gujarat, Rajasthan, and Maharashtra played a significant role in local transmission, whereas Karnataka played a significant role in inter-state transmission.
We conclude that the spread of COVID-19 in India was mainly through local transmission until lockdown 1.0 and had not reached the level of community transmission 23 .

Data availability:
The data used in the present study were obtained from https://www.covid19india.org and include the patient number, the state each patient belongs to, their travel history and source. In this study, we included a total of 1386 infection cases, of which 373 are international and 1013 are national contacts. The data cover the period from 30 January to 6 April 2020, and the network was created in Gephi (version 0.9.2).
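As a hedged sketch of how records of this kind might be turned into an edge list for a tool such as Gephi, the example below uses a handful of invented records. The field names (`patient`, `state`, `source`, `international`), the values, and the helper functions are assumptions for illustration and do not reflect the actual schema of the covid19india.org data. The Source/Target column header follows the convention commonly used for edge tables in Gephi's spreadsheet import, which is worth confirming for the version in use.

```python
import csv
import io
from collections import Counter

# Hypothetical records: field names and values are illustrative assumptions,
# not the actual schema or contents of the covid19india.org data.
records = [
    {"patient": "P1", "state": "Kerala", "source": "Wuhan", "international": True},
    {"patient": "P2", "state": "Maharashtra", "source": "Dubai", "international": True},
    {"patient": "P3", "state": "Karnataka", "source": "Dubai", "international": True},
    {"patient": "P4", "state": "Maharashtra", "source": "P2", "international": False},
]

def to_edges(rows):
    """One directed edge per record, from the reported source to the patient."""
    return [(row["source"], row["patient"]) for row in rows]

def edges_to_csv(edge_list):
    """Serialise edges as CSV text with Source,Target columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target"])
    writer.writerows(edge_list)
    return buffer.getvalue()

edges = to_edges(records)
# Tally international versus national contacts, as in the case counts above.
origin_counts = Counter(
    "international" if row["international"] else "national" for row in records
)
# Out-degree of each source: how many recorded cases are traced back to it.
out_degree = Counter(source for source, _ in edges)
csv_text = edges_to_csv(edges)
print(origin_counts)
print(csv_text)
```

Applied to the full dataset, a tally of this kind would be expected to reproduce the reported split of 373 international and 1013 national contacts, which together make up the 1386 cases.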