eHealth-engagement on Facebook during COVID-19: a netnographical data visualization

Understanding social media networks and group interactions is crucial to the advancement of linguistic and cultural behaviour. This includes the manner in which people accessed advice on health, especially during the global lockdown periods. With most activities curtailed by isolation rules, some people, especially in older generations, turned to social media to access information on health. Facebook public pages, groups and verified profiles matching the keywords "senior citizen health", "older generations", and "healthy living" were analysed over a 12-month period to examine engagement promoting good mental health. CrowdTangle was used to source English language status updates and photo and video sharing information, which resulted in an initial 116,321 posts and 6,462,065 interactions. Data analysis and visualisation were used to explore large datasets, including natural language processing for "Message" content discovery, word frequency and correlational analysis and co-word clustering. Preliminary results indicate strong links to healthy aging information shared on social media, which showed correlations to global daily confirmed case and daily death totals. The results can be used to identify public concerns early on and address mental health issues in the senior generation on Facebook.


Introduction
Social media can be fluidly defined as a center of social group (or community) networks that initiate conversations and enable relationships to form [1]. It can also take on a creative role or that of information sharing [1,2]. Freberg [3, p.772] claims that social media is used to "engage, reach, persuade and target specific audiences across multiple platforms". Social media can also be used to gain insight into how behaviours in communities happen at any given time [1], and traditional methods in research can "test and evaluate" [p.54] the substantial unstructured data online. Freberg [1] also identifies growing expectations to make sense of online data. This study sought to understand how people engage in active aging activities through participation in Facebook groups, pages and verified profiles through reactions, shares and followers, etc.
Facebook, as the largest social media platform [1], and Instagram were chosen as the social media platforms to investigate how senior users engage with pages, groups and verified profiles on health during the COVID-19 lockdown periods from October 2020 to September 2021. Facebook claims to have 2.9 billion active users, of which CrowdTangle tracks public content from over 7 million Facebook pages, groups and verified profiles [4], and Instagram hosts over 2 million public accounts. Whilst there are other platforms, these two were chosen due to ease of accessibility. 80% of users between the ages of 15 and 34 rely entirely on social media for news and current affairs [5]. Facebook was designed for adults, to enable maximum network reach [6].
A 2017 study reported that 84% of Fortune 500 companies have a Facebook page [7]. Social media platforms are often used by groups for a specific community (such as "Healthy Living Group") whose main purpose is to educate. By using social media strategically, aiming content at a particular audience, much information can be disseminated as well as collected [1]. Audiences, relationships, personalities, content, actions and innovativeness became very important during the COVID-19 lockdown stages globally [1].
Posts, and especially posts regarding health, became of cardinal importance during lockdown. It was the only means of communication for some people during the global isolation stages. Online behaviour increased globally during the pandemic [8] and can be a valuable field of information.
Facebook reactions include "likes", "wow", "love" and "haha", all grouped as "Total Positive" reactions, and "Sad" or "Angry", grouped as "Total Negative". How much credibility one can assign to "likes" has always been in question [7,9], and La France [9] claims that reactions were designed to "blunt-force emotional reaction" (p.19) by the click of a button. Lee et al. [10] feel that "likes" have the ability to drive traffic and serve a socialization component, but admit that the connotation can vary. These reactions, including "sharing" behaviour, pose social risks. Praet et al. [11] feel that reactions on social media serve as mere "echo-chambers" (p.3).
Netnography has also evolved during the pandemic. Historically, it gives users a lens or an opportunity to address cultural, health or other burning issues and, when applied to the common good, can bring about change [12]. Netnography studies the online interactions between individuals through internet connections or computer-mediated communications [13]. These communications bind the user with more than just transmission of information; they also tie people together through a "common interactional format, location or 'space', …or virtual 'cyberspace'" (p. 16).
This manuscript investigated older generations' health seeking behaviour on Facebook with regards to active aging. The term "active aging" is considered to be a process of "optimizing opportunities for health, participation and security in order to enhance quality of life and well-being" [14]. Most countries have policies to encourage people to work as they grow older and to delay chronic illnesses that are costly to the state and healthcare systems. Participation includes active involvement in the labour force, be it socially, economically, culturally or spiritually. Spending time on social media has become very popular during this time, in which most people work from home and spend their time online.
Data visualization and storytelling were used to illustrate how groups and pages on Facebook were used to seek information. In many instances data is digitalized to further analyze data and trends [8,9]. Data visualization and effective communication thereof can turn insights into action [17]. Recently, digital storytelling has gained popularity amongst researchers in the field of Humanities. Communicating information using storytelling forms a crucial instrument for effective communication [10].

Materials and Methods
Two strategies are employed when collecting online data: systematic collection via various platforms or third party software; and systematic organization of raw data and analysis of the organized data [1]. This study employed third party software to collect social media updates on Facebook groups and pages.
CrowdTangle was used with the search words "senior citizen health", "older generation" and "healthy living", resulting in 6,462,065 interactions with 116,321 posts on public Facebook pages and groups in the English language from 1 October 2020 to 30 September 2021. COVID-19 data was obtained from the Johns Hopkins University Dashboard [18].
Facebook Groups searches resulted in 3,113 interactions from 294 posts and Facebook Pages resulted in 62,758 interactions from 896 posts. Excel spreadsheets were prepared with social media and COVID-19 [15] data for the period 1 October 2020 to 30 September 2021. Columns in the Facebook data that were not essential to this study, including identifiable information, were removed: "User_Name", "Facebook ID", "Page Description", "Date Page Created", "Post Created with Date and time" (only "Post Created Date" was retained), "Post created Time", "Video Share Status", "Is Video Owner?", "Post Views", "Total Views", "Total Views for All Cross Posts", "Video Length", "URL", "Link", "Final Link", "Image Text", "Description", "Sponsor ID", "Sponsor Name" and "Sponsor Category". OpenRefine 3.4.1 was used to remove trailing and leading white spaces and to change all data in the "Message" column to lower case and text for easier manipulation in Python 3. Data was further prepared by simplifying column names and removing duplicated columns, such as "Date" in both datasets. Data was filtered according to page categories: non-health categories were removed and the remainder was cleaned of spam and advertisements, after which 27899 entries remained. All text in "Message" was transformed to lowercase, white spaces were removed and the column was converted to "text" using OpenRefine. All "NaN" and missing values were removed.
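The cleaning steps above were carried out in Excel and OpenRefine. As a rough illustration only, the same logic could be sketched in plain Python as follows. The function name `clean_rows`, the sample rows and the shortened `DROP_COLUMNS` set are hypothetical; only the column names themselves come from the description above.

```python
# Illustrative sketch, not the authors' pipeline (they used Excel and OpenRefine).
# DROP_COLUMNS is a shortened, hypothetical subset of the removed columns.
DROP_COLUMNS = {"User_Name", "Facebook ID", "URL", "Link"}


def clean_rows(rows):
    """Drop non-essential columns, normalise "Message", skip missing messages."""
    cleaned = []
    for row in rows:
        message = row.get("Message")
        # Treat None, empty strings and the literal text "NaN" as missing.
        if message is None or message.strip().lower() in ("", "nan"):
            continue
        kept = {key: value for key, value in row.items() if key not in DROP_COLUMNS}
        # Remove leading/trailing white space and convert to lower case.
        kept["Message"] = message.strip().lower()
        cleaned.append(kept)
    return cleaned


# Hypothetical rows standing in for a CrowdTangle export.
sample_rows = [
    {"User_Name": "page_a", "Message": "  Stay ACTIVE at 70!  ", "Post Created Date": "2020-10-01"},
    {"User_Name": "page_b", "Message": "NaN", "Post Created Date": "2020-10-02"},
    {"User_Name": "page_c", "Message": None, "Post Created Date": "2020-10-03"},
    {"User_Name": "page_d", "Message": "Healthy living tips", "URL": "https://example.com",
     "Post Created Date": "2020-10-04"},
]
cleaned_rows = clean_rows(sample_rows)
```

Filtering by page category and removing spam would follow the same pattern, with an additional condition inside the loop.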

Analyses:
The combined dataset was cleaned using OpenRefine, and Python 3 was used for content analysis, visualization and data analyses. Python 3 was used for data visualization in most cases, using Matplotlib, Seaborn, Plotly and TextBlob. Both cleaned datasets were imported to a workbook in Jupyter Notebook. Variables revealed no substantial violation of normality regarding distribution (skewness < 1 and kurtosis > 3) and no outliers were identified.
Both datasets were described to indicate mean and standard deviation as well as distribution and correlation. Summaries of total COVID-19 infections, daily rates, cumulative deaths, daily deaths, and likes and followers of postings were plotted to give a visual overview of the data. A histogram and probability were plotted against new cases. The top 10 countries of Page and Group administrators were determined, and the top 10 Page Categories and Types were differentiated. The top 10 pages with the most posts were also identified. The messages (which included status updates and titles of photos or videos) were analysed for sentiment polarity, message length and word count, and the top 10 bigrams and trigrams were identified, excluding stopwords. Words were tagged and analysed, and bivariate analysis was performed. Finally, topic modeling was performed and presented schematically using Latent Dirichlet Allocation (LDA) in GenSim. A t-distributed Stochastic Neighbor Embedding from Scikit-learn, which converts similarities between data points to joint probabilities, was used to visualize high-dimensional data. Ethical mining of social media protocols was strictly adhered to. The study was registered with the University of Zululand Research Ethics Committee (UZREC 171110-030 Dept. 2021/1) and all identifying information has been removed from the raw dataset.
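The text does not state which estimator was used for skewness and kurtosis. As a minimal sketch, assuming the simple moment-based (population) definitions, the two shape statistics could be computed as follows. The function name `shape_statistics` and the sample values are hypothetical; library routines such as those in pandas apply bias corrections and report excess kurtosis, so their values would differ slightly.

```python
def shape_statistics(values):
    """Return (skewness, kurtosis) from central moments.

    Kurtosis here is Pearson's definition, for which a normal
    distribution has a value of 3 (not "excess" kurtosis).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance (population)
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("values are constant")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return skewness, kurtosis


# Hypothetical, perfectly symmetric sample: skewness is zero.
skewness, kurtosis = shape_statistics([1, 2, 3, 4, 5])
```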
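The bigram and trigram step described above can be illustrated with a short, dependency-free sketch. It is not the authors' code: the function name `top_ngrams`, the tiny `STOPWORDS` set and the sample messages are assumptions made for illustration, and a real analysis would use a full stopword list.

```python
import re
from collections import Counter

# Deliberately tiny, illustrative stopword list (an assumption).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "for", "in", "is", "your"}


def top_ngrams(messages, n, k=10):
    """Return the k most frequent n-grams after removing stopwords."""
    counts = Counter()
    for message in messages:
        tokens = [t for t in re.findall(r"[a-z']+", message.lower())
                  if t not in STOPWORDS]
        # zip over n shifted copies of the token list yields the n-grams.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)


# Hypothetical messages, not taken from the dataset.
sample_messages = [
    "Healthy living tips for senior citizen health",
    "Senior citizen health is the key to healthy living",
    "Healthy living and mental health",
]
top_bigram = top_ngrams(sample_messages, 2, k=1)
top_trigram = top_ngrams(sample_messages, 3, k=1)
```

Because stopwords are removed before the n-grams are formed, two words that were separated only by a stopword are counted as adjacent. Whether this matches the original analysis is not stated in the text.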

Descriptive analysis
Of the total interactions obtained from CrowdTangle [19], a public insights tool owned and operated by Facebook, 696 unique posts were identified. The concatenated dataset, comprising Facebook pages and groups and WHO COVID-19 Daywise [18] numbers, resulted in a total of 23 430 rows and 28 columns of data.
Dataset 1 [11] contained 365 rows of information (from 1 October 2020 to 30 September 2021) in 6 columns, arranged by Confirmed COVID-19 cases, Confirmed Deaths, New Cases, New Deaths, and Number of Countries.

Table 1: Descriptive Summary Analysis of COVID-19 Global Statistics
Dataset 2 [12] contained 28948 rows of information in 21 columns of clean data. Table 2 depicts the descriptive statistics of Dataset 2 based on Facebook Pages and Groups over a 12-month period on health and aging. It describes the number of likes and followers at posting and various reactions, such as likes, love and wow emoticons. The "Total Positive" includes all the positive reactions, excluding "Sad" and "Angry", as these were grouped under "Total Negative".
The researchers attempted to see if a noticeable trend existed between "Followers at Posting", "Likes at Posting", the number of "Comments" made on posts, as well as the "Total Positive" and "Total Negative" reactions. Based on Fig. 2 below, there is little evidence of a relationship between these variables.
When creating pages and groups, administrators can assign the pages to specific categories (Fig. 7) in order for members or followers to find them. There were a total of 180 unique categories: "Gym" (6028) had the most posts, followed by "Media News Companies" (2747) and "Health Site" (2742) with the second and third most posts in a 12-month period.
A total of 10,769 unique page names occurred in a 12-month period, indicating that several pages posted more than once. The top 3 pages or groups posted 779, 545 and 261 times respectively. In most cases, the page name had the word "Health" in it.
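The grouping of reactions into "Total Positive" and "Total Negative" is a simple sum per post. A minimal sketch follows, assuming hypothetical column names ("Likes", "Love", "Wow", "Haha", "Sad", "Angry") and a hypothetical post; the actual column labels in the export may differ.

```python
# Column names are assumptions based on the reaction types named in the text.
POSITIVE_REACTIONS = ("Likes", "Love", "Wow", "Haha")
NEGATIVE_REACTIONS = ("Sad", "Angry")


def reaction_totals(post):
    """Return (total_positive, total_negative) for one post."""
    total_positive = sum(post.get(name, 0) for name in POSITIVE_REACTIONS)
    total_negative = sum(post.get(name, 0) for name in NEGATIVE_REACTIONS)
    return total_positive, total_negative


# Hypothetical post; "Comments" is ignored because it is not a reaction.
example_post = {"Likes": 120, "Love": 30, "Wow": 5, "Haha": 2,
                "Sad": 4, "Angry": 1, "Comments": 9}
total_positive, total_negative = reaction_totals(example_post)
```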
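The relationship between variables such as "Followers at Posting" and "Comments", examined visually earlier in this section, could also be quantified with a correlation coefficient. The sketch below computes Pearson's r by hand. The function name `pearson_r` and the sample values are hypothetical, and the text does not state that this particular statistic was used for these variables.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical values, not taken from the dataset.
followers_at_posting = [100, 200, 300, 400, 500]
comments = [3, 1, 4, 1, 5]
r_followers_comments = pearson_r(followers_at_posting, comments)
```

A value of r close to 0 would be consistent with "little evidence of a relationship", while values near 1 or -1 would indicate a strong linear association.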

Message
The "messages" retrieved were analysed using Natural Language Processing in Python, using TextBlob; the analysis took 1.9 seconds to run. Messages were analysed for word, character, and sentence count as well as average word and sentence length, as depicted in Table 2 below.
Messages were analysed for sentiment and polarity. Posts regarding aging and health were positive on average, with a mean sentiment polarity of 0.27. Mean message length was 650 characters, with a minimum of 2 and a maximum of 5 660. Posts with status updates and photos, videos and links with descriptions contained a minimum of 1 word and a maximum of 1 038, with a mean of 98.12.
Word embedding or vectorising is an important tool to understand the context of words in NLP [20]. The concept refers to the meaning of words in relation to their distribution in the text. In the drawing (Fig. 11), the word being investigated is represented by a red dot and relates to other words by how close they are in the grammatical environment. Representations of input words form an important part of NLP research [20]. In the example below, the word "health" appeared in both the top bigrams and trigrams. The word vector represents probability distributions of the word "health".
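The message-level counts described at the start of this section (word, character and sentence counts, and average word and sentence length) can be sketched without any NLP library. The function name `message_stats` and the sample message are hypothetical, and the tokenisation is deliberately naive: words are split on white space and sentences on ".", "!" and "?", so results will differ somewhat from TextBlob's tokeniser.

```python
import re


def message_stats(message):
    """Return simple descriptive counts for one message."""
    words = message.split()  # white-space tokens; punctuation stays attached
    sentences = [s for s in re.split(r"[.!?]+", message) if s.strip()]
    word_count = len(words)
    return {
        "char_count": len(message),
        "word_count": word_count,
        "sentence_count": len(sentences),
        "avg_word_length": (sum(len(w) for w in words) / word_count
                            if word_count else 0.0),
        "avg_sentence_length": (word_count / len(sentences)
                                if sentences else 0.0),
    }


# Hypothetical message, not taken from the dataset.
stats = message_stats("stay active. eat well!")
```

Sentiment polarity is not reproduced here because it depends on TextBlob's lexicon.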

Discussion
Two datasets were investigated to identify any relation between social media health posts and COVID-19 new infections and/or new deaths over a 12-month period. The obvious relation between daily new infections and daily new deaths, as depicted in Graph 1, has been identified in numerous studies that corroborate this trend [22][23][24][25].
What is important to note is that the number of "Likes at Posting" (and even "Total Positive") is consistently higher than the number of shares, whereas one would assume that if someone "liked" a post, they would want to disseminate that information. While remaining wary of literature that refers to "echo-chambers" [11] and "emotional blunting" [10], the researchers note that Fig. 2 indicates an increase in reactions to peaks and valleys of the infection rates, except in one instance, Aug 2021.
The positive trend-line in positive reactions can possibly be attributed to resilience strategies, such as re-engaging in past activities that used to bring joy, for example exercising. More emphasis was placed on the importance of physical health and, by implication, mental health [24,25]. To promote healthy behaviour, Pages and Groups employed several strategies, such as 'getting out of your fans' way' [28] and sharing expertise through texts and visual posts. The health pages and groups in the sample focused on positive posts to improve physical health (Fig. 3 and Fig. 10).
It is clear from Fig. 3 that over the course of a year, positive reactions increased against the decline in negative comments. Based on various studies [10,11], this can be attributed to a blunting or desensitization effect. More realistically, it can be attributed to a more general concern for physical and mental well-being [24] and therefore an increasing awareness of healthy ways to live. Fig. 4 shows a rise in social engagement on Facebook Pages and Groups, while Facebook reported a steady increase in users for the same period [29]. According to Dykes [17], social media is used more and more to disseminate information and drive change [28]. In their 2021 study, Osuwu-Ansah et al. [30] also claim that groups and pages serve the "purpose of information-sharing, peer-tutoring, learning and finding friends" (p.7), but state that social media competence can be a hindrance. Older generations might use Facebook only to "learn" and "share" rather than to comment or voice their opinion.
Most posts (Fig. 5) on aging and health were photographs, and the second most common posts were links. Photographs are one of the most popular features on Facebook. Personal photos can be shared, tagged and at times have their own comment sections, which allows for conversation. Often campaigns encourage followers to tag or share photos [28]. It is relatively easier to share a photo than to "make up" a new text post, which could be explained by the "echo-chamber" phenomenon [11]. It remains debatable to what extent followers engage with posts, and this is not in the scope of this research.
It comes as no surprise that 27% (Fig. 6) of the pages and groups that posted on health and aging have administrators based in the United States of America. Northern America is purported to have the third largest population of Internet users [31], and the USA and China are the highest users of social media, with a reported 71.5% Facebook subscribers in the USA [32].
The majority of categories that pages belong to relate to institutions or companies that promote health. People who search for health information would be drawn to these categories and to pages with "health" in the name.
Most page or group names in Fig. 8 have the words, or stems of the words, "health" or "family" in the name, as depicted in the wordcloud. Names of groups or pages therefore reflect the content and will rank higher in a search. It therefore stands to reason that the pages or groups will have similar names.
Although no thematic analysis was conducted, a cursory categorical analysis revealed that pages in the Gym category (Fig. 9) included the most posts on healthy aging or aimed at older members.
Healthy living, or actions that would improve one's health, is in essence a positive sentiment. This is corroborated by the majority of posts (Table 2 and Fig. 10) having a positive sentiment.
As can be seen in Fig. 11, words such as welfare, services, patients, doctors, prevention, medicare, education, social, etc. are used either before or after the word "health" in the posts. Much emphasis is placed on "aids", "benefits", "prevention", "nutrition", "social", etc. in the words used on Facebook.
There were several limitations to the study. It was not the aim of the researcher to prove the credibility of information on the various social media sites, but rather to investigate how and to what extent information was disseminated. The "Types" of posts, such as "Photos" or "Videos", were not investigated and only "Statuses" were analysed; the other types would be interesting to examine in future studies. This opens opportunities to do further training in digital data analysis and more in-depth thematic analyses. The focus of the study was not on the influence of COVID-19 posts, but on statuses that focused on health during the specific COVID period.
Specific thematic analysis was not in the scope of this research and therefore peaks-and-valleys incidents cannot be explained. Pathological use and rumination on health conditions were not investigated and do not form part of this study.

Conclusions
This study attempted to visualize data obtained from Facebook through CrowdTangle. It tried to show links between status updates during the COVID-19 pandemic and health information seeking behaviour in older generations. It also looked at using simple coding to make this analysis and how it would present information. It appears as if positive information seeking behaviour increased during lockdown periods. This information can have an effect on how information is disseminated in future. Social media does not seem to be the sole playground of the young and the youthful.