Analysing discussions around Rural Health on Twitter during the COVID-19 pandemic

Individuals from rural areas are increasingly using social media as a means of communicating, receiving information, or actively complaining about inequalities and injustices. This study captured 57 days’ worth of Twitter data related to rural health from June to August 2021. The study utilised social network analysis and natural language processing to analyse the data. It was found that Twitter served as a fruitful platform to raise awareness of problems faced by those living in rural areas. Overall, Twitter was utilised in rural areas to express complaints, to debate, and to share information. Twitter could be leveraged as a powerful social listening tool for individuals and organisations who want to gain insight into public views around rural health.

tion by large urban hospitals and clinics in the United States of America [4]. Twitter has also been used as a new source of data to study depression and its wider determinants in the slum populations in India and Brazil and for predictive analytics and sentiment analysis [5].
A recent study of scientific literature analysing the implications of Twitter in health-related research identified a high diversity of themes ranging from professional education in healthcare, to big data, social marketing and substance use, physical and emotional well-being of young adults, and public health and health communication [6].
The analysis of social media provides a useful tool for public health specialists and government decision makers to gain insight into population reactions and feelings [7], especially in times of uncertainty like the one we are facing with the present pandemic [8].
A study by Cuomo et al. analysed the geospatial distribution of Tweets related to COVID-19 to try to illustrate the full scope of the pandemic. The authors found that rural areas in the United States of America engaged in COVID-19 social media conversations at later stages compared with urban areas [9].
The place of birth has been regarded as an important determinant of health [10]. The availability of resources in rural areas differs from that in urban areas, and this has an impact on population health [11,12]. Another problem in rural areas is the shortage of health professionals willing to work there [13]. Some initiatives are being developed to promote interest in rural health in this context. One such initiative uses social media for this purpose: the Rural Family Medicine Cafés, which since 2015 have been organising regular meetings over social networks to put health professionals who work in, or have an interest in, rural health in contact with one another [14,15].
There are few studies investigating the use of Twitter in relation to rural health issues or analysing the most common topics covered in these areas. This is particularly interesting at the time of the COVID-19 pandemic. The overall research aim of our study is to analyse the conversations about rural health taking place on Twitter during the COVID-19 pandemic to better understand the use of this social media tool in rural settings. More specifically, the objectives of this study are to:
• Develop an understanding of the content and debates being shared on Twitter.
• Identify influential users around rural health on Twitter.
• Uncover the key hashtags and websites being shared.

Sampling Tweets and Ethical Approval
This study made use of the Twitter Archiving Google Sheets (TAGS) tool to retrieve 15,586 tweets matching the "rural health" keyword. Tweets were retrieved from 10/06/2021 to 06/08/2021, covering 57 days. TAGS draws upon the Twitter Search Application Programming Interface (API) to retrieve tweets. The project received ethical approval from Newcastle University (Approval number: 2036/2020). Although it can be argued that tweets are in the public domain, the project was careful not to draw attention to individual users acting in a personal capacity. The users and key tweets reproduced in this study therefore derive from accounts and users in the public domain, including social media influencers, health organisations, politicians, and academic journals.
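The length of the collection window can be checked with a short calculation. The sketch below assumes the retrieval dates are written day-first (DD/MM/YYYY), which is consistent with the June to August 2021 period given in the abstract; the function name `collection_window_days` is hypothetical and is not part of TAGS.

```python
from datetime import datetime


def collection_window_days(start, end, fmt="%d/%m/%Y"):
    """Return the number of days between two date strings.

    Assumption: dates are day-first (DD/MM/YYYY). The result is the
    difference between the two dates, so the end date is not counted
    as an extra day.
    """
    start_date = datetime.strptime(start, fmt).date()
    end_date = datetime.strptime(end, fmt).date()
    return (end_date - start_date).days


# Retrieval dates as reported in the text.
span = collection_window_days("10/06/2021", "06/08/2021")
print(span)
```

Read day-first, the difference between the two dates is 57 days, which matches the figure reported above. Read month-first, the same strings would describe a window from October 2021 to June 2021, which is not a valid interval.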

Data Analysis
The software NodeXL (Social Media Research Foundation, California, CA, USA) was used to conduct a social network analysis of the data [16]. The network graph was laid out using the Harel-Koren Fast Multiscale layout algorithm that is integrated into NodeXL. Natural language processing was applied to the tweets in order to identify word-pair correlations in the clusters identified from the social network analysis. To identify influential users, we applied betweenness centrality, a metric from network theory that is used in this paper to find the Twitter users who have influence in our dataset. This methodology has been used by the authors in previous research [17][18][19].
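To illustrate the betweenness centrality metric, the following is a minimal sketch using Brandes' algorithm on a small undirected, unweighted graph. It is not NodeXL's implementation, and the account names and edges are invented for illustration only. Each edge stands for an interaction (for example a mention or retweet) between two accounts.

```python
from collections import defaultdict, deque


def betweenness_centrality(edges):
    """Unnormalised betweenness centrality (Brandes' algorithm).

    Assumes an undirected, unweighted graph given as (u, v) pairs.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    adj = dict(adj)

    score = dict.fromkeys(adj, 0.0)
    for source in adj:
        # Breadth-first search counting shortest paths from the source.
        order = []
        preds = {node: [] for node in adj}
        sigma = dict.fromkeys(adj, 0)
        dist = dict.fromkeys(adj, -1)
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        delta = dict.fromkeys(adj, 0.0)
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # In an undirected graph every pair is visited from both ends.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical interaction network: "hub" links three accounts, "c" links "d".
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("c", "d")]
centrality = betweenness_centrality(edges)
for user, value in sorted(centrality.items(), key=lambda item: -item[1]):
    print(user, value)
```

In this toy network "hub" lies on the shortest path between five pairs of accounts and "c" on three, so they rank first and second, while the remaining accounts score zero. Accounts with high scores act as bridges between otherwise separate parts of the network, which is the sense in which the metric indicates influence.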
Patient consent was waived because our study involved the analysis of publicly available Twitter data, highlighted only influential users that are in the public domain, and included no patients. No private and/or personal non-public information was utilised. Ethical approval was gained from Newcastle University (reference 2036/2020).
There are also a number of other smaller groups and broadcast networks giving the overall network a community shape. Appendix 1 contains a full list of keywords associated with each of the clusters, which provide insight into the types of topics that were being discussed.
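The idea of counting word pairs within a cluster can be sketched as follows. This is an illustrative assumption about the procedure, not a description of NodeXL's internals: here a word pair is taken to be two adjacent words after lowercasing and removing a small stop-word list. The example tweets and the `STOP_WORDS` set are invented.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list.
STOP_WORDS = {"the", "a", "of", "in", "to", "and", "is", "for"}


def word_pairs(tweets):
    """Count adjacent word pairs across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        tokens = [
            token
            for token in re.findall(r"[a-z']+", text.lower())
            if token not in STOP_WORDS
        ]
        # Pair each token with the one that follows it.
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Invented tweets standing in for the tweets of one cluster.
cluster_tweets = [
    "Rural health workers fighting covid",
    "Fighting COVID in rural health centres",
]
pair_counts = word_pairs(cluster_tweets)
for pair, n in pair_counts.most_common(2):
    print(pair, n)
```

Because stop words are removed before pairing, words separated only by a stop word are treated as adjacent. Whether that is desirable depends on the analysis, and a different tokeniser or stop-word list would change the counts.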

Results of Social Network Analysis
Aside from 'rural, health' itself, the most popular co-word combination in group 1 was 'fighting, covid' (n=873). Other interesting keywords identified within this group included 'busting, myths' (n=873), indicating the combatting of misinformation. The tweet ranked third by number of retweets (Table 4) may account for some of these keywords. In group 2, interesting word combinations included 'health, systems' (n=373), 'expanding, medicaid' (n=256), 'taxpayers, money' (n=254), and 'affordable, health' (n=212).
There appears to be a constant stream of Twitter activity within the dataset, with two large peaks taking place on the 18th of June and the 16th of July, respectively. Overall, there appears to be much more activity taking place during June 2021. These peaks relate to spikes in retweets of the tweets contained in Table 4.
The second most used language is Tagalog, mixed with English in the main body of the tweet (407 retweets, 7%). The third most used language is Korean, although it is used only in a hashtag, with the main text of the tweet in English (390 retweets, 6.9%). The fourth is Hindi, which is used in only one of the top ten retweets in our dataset (311 retweets, 5.5%).

Discussion
Regarding language, our study identified that the Latin alphabet was the most widely used. The study also found that Devanagari was used for the text body of tweets, and that the Korean alphabet was used on occasion only for the hashtag part of tweets (the main body of the tweet is written in English).
Another limitation is that the Search API used can only retrieve data from public-facing Twitter accounts and not from private accounts; however, the majority of accounts are set as public. A further limitation is that, because our study retrieved data using a very specific keyword (rural health), our data may have excluded tweets from users who tweeted about the topic without using our target keyword.
Assessing the needs of those living in rural communities has traditionally been challenging. Several circumstances have acted as constraints: language barriers, isolation, lack of registries, difficulty carrying out interviews, the location of households, and the cost of performing studies. Twitter could help to address these problems and could be used as a social listening tool to identify the concerns and needs of rural communities. Our study shows that Twitter can be used effectively in at least two ways: as a means of communication in rural areas and as a source of information on rural health. Moreover, the information existing on Twitter, when filtered by geographical location, may be of interest to stakeholders, healthcare workers, politicians, patients, and communities in general.
Twitter could also be used strategically by those living in rural areas to communicate with one another, for example to share local updates and to warn of disasters and of areas to avoid. It could also be used as a way of connecting people to share resources and supplies. This could be facilitated through the use of area-specific hashtags that are widely advertised and popularised locally.

Conclusions
Twitter has been shown to be a powerful means of communication that is used in rural areas. Twitter is a capable tool for raising awareness of the problems existing in rural health. India is the country with the most Twitter-related conversations on rural health. Twitter is used in rural areas to express complaints, to debate, to share information, to acknowledge somebody or something, and to run advertisements or political campaigns. Twitter could be leveraged as a powerful source of information for individuals and organisations working on rural health.