Preprint
Article

Unmasking AI-Generated Fake News Across Multiple Domains

This version is not peer-reviewed.

Submitted:

09 May 2024

Posted:

13 May 2024

You are already at the latest version

Abstract
The spread of fake news poses a serious threat to public trust and informed decision-making. The emergence of AI-generated fake news exacerbates this issue, as it can be produced more rapidly and convincingly than human-generated content. Addressing this issue, our paper introduces a robust approach to detect AI-generated fake news that is applicable across major disinformation domains: Politics and Elections, Health and Medicine, Science and Technology, and Entertainment and Celebrities. Our models differentiate between AI-generated and human-generated news articles while assessing their truthfulness. Utilizing large language models such as ChatGPT, we curated datasets spanning both AI and human generated fake and genuine news. Following extensive data preprocessing and the application of diverse feature extraction techniques, our models were trained using multiple machine learning algorithms. Our results show the feasibility of accurately identifying AI-generated fake news, thereby adding a significant layer of defense in combating misinformation across various sectors. In addition, we have explored the potential utility of transferable machine learning models trained on data from one domain (e.g., politics), when tested on another domain (e.g., health).
Keywords: 
;  ;  ;  ;  ;  ;  ;  ;  

1. Introduction

The problem of fake news is not new; misinformation, disinformation, and propaganda have existed for as long as humans have communicated. However, the digital age –particularly the rise of social media and other rapid communication technologies – has magnified the impact and reach of fake news. AI-generated fake news adds another layer of complexity to this issue. The problem with AI-generated fake news lies in its ability to create highly convincing and sophisticated misinformation at an unprecedented scale and speed across multiple domains. Unlike human-generated fake news, which can sometimes be easier to detect due to inconsistencies or errors, AI-generated content can be nuanced, making it more difficult to detect.
Detecting AI-generated fake news is interesting and important due to a confluence of technological advancements and societal implications across various domains. Recent advances in large language models (LLMs) have made it possible to automate the creation of compelling fake news, adding a new layer of complexity to detecting fake news. The importance of tackling this issue cannot be overstated, as the rapid and wide dissemination of AI-generated misinformation can have detrimental societal impacts in various sectors.
Most earlier fake news detection solutions are largely tailored to detecting human-generated fake news [1]. These methods often focus on tell-tale signs of misinformation, which are characteristic of human errors and might not be applicable to sophisticated AI-generated content. The absence of comprehensive datasets featuring AI-generated fake news has been a significant hindrance. Without such datasets, it has been challenging to train models that can effectively (1) distinguish between human-generated and AI-generated fake news or (2) determine the veracity of the content. 
This work: Unmasking AI-generated Fake News across Domains This work introduces a novel method to detect AI-generated fake news, addressing prior research limitations. Our approach adapts to advanced AI-generated fake news techniques, enhancing traditional machine learning frameworks with sophisticated feature extraction to detect subtle AI-generated text nuances. We provide a balanced training environment for such ML methods by introducing first-of-its-kind datasets that include both human-written and AI-generated articles. 1 Our work directly addresses the issues of AI-generated fake news on the Web, contributing to the identification and detection of misleading information and emphasizing online content integrity and trustworthiness.

2. Challenges and State of Related Work

Misinformation, often interchangeably used with "disinformation", poses a significant challenge to today’s digital information ecosystem. Misinformation has the potential to sway public opinion, harm reputations, and even influence critical decisions. Recognizing the dire need to address disinformation, numerous researchers have embarked on efforts to detect, categorize, and counter misinformation [2,3,4,5,6,7]. In particular, much of the existing research emphasizes the detection of fake news [8,9,10,11] and is primarily focused on detecting human-generated misinformation. These techniques range from linguistic analysis to utilizing standard machine learning algorithms with feature-engineering. However, the emergence of AI-generated fake news, especially those generated by Large Language Models (LLMs), has presented a new and pressing challenge. LLMs, especially transformer-based ones with billions of parameters [12,13,14], signify a pivotal advancement in natural language processing. These expansive neural networks are pretrained on extensive text data, empowering them to produce coherent and contextually appropriate content that can sometimes rival the quality of human-written content. Some renowned open-source LLMs include T5 [15], Llama2 [16], CPM2 [17], Baichuan2 [18], OPT [19], ChatGLM [20], T0 [21], Falcon [22], Llama1 [23], and BLOOM [24]. On the other hand, proprietary LLMs encompass models such as Sparrow [25], WebGPT [26], ChatGPT-3.5 (gpt-3.5-turbo), GPT-4, PaLM [27], FLAN [28], AlexaTM [29], Anthropic [30], Chinchilla [31], GPT-3 [32], GLaM [33], and ERNIE [34].

2.1. Challenge: Domains of Fake News; a Multifaceted Challenge

The impact of fake news extends across diverse domains [35]. Each domain presents distinct challenges in terms of information verification and dissemination patterns. In Politics and Elections, misinformation can significantly influence public opinion and electoral outcomes, necessitating robust fact-checking mechanisms. Health-related misinformation can exploit public anxieties, making timely and accurate debunking crucial for public safety and trust in healthcare institutions. Scientific communities face the challenge of addressing misinformation while upholding principles of open discourse and accessibility to information. Entertainment and celebrity-focused fake news may be driven by sensationalism, requiring vigilant scrutiny to discern factual reporting from sensationalism.
Historically, there have been datasets on fake news in each of these domains, however, focusing primarily on human-generated fake news. Hence, we leveraged Language Learning Models (LLMs) such as ChatGPT-3.5 (gpt-3.5-turbo) to create both authentic and fictitious news across each domain; to the best of our knowledge, these are the first such datasets to be publicly available, enabling future research in detecting AI-generated misinformation more effectively.

2.2. Challenge: The Influence of Artificial Intelligence on Misinformation

In recent years, the intersection of artificial intelligence (AI) and fake news has introduced a new dimension to the challenge of misinformation. Advances in large language models (LLMs) have showcased the ability to automate the creation of persuasive fake news, elevating the sophistication of the content they produce. Such models, driven by vast neural networks, can craft text that closely mirrors human-written content. This makes distinguishing between genuine and AI-generated information exceedingly difficult. Historically, numerous studies have probed the dangers of misinformation produced by neural networks, suggesting that it is crafted by specific models [36,37,38,39,40,41,42,43,44], or focusing on the identification of such neural misinformation [45,46,47,48,49,50]. Recently, there have been some preliminary studies focused on misinformation generated by LLMs [51,52,53,54,55].
Differing from recent studies, we present a detection mechanism for AI-generated misinformation. Using a dataset with both AI-generated and human content, our approach performs consistently with various machine-learning models and across various domains. We conducted cross-domain testing to assess our method’s transferability. Our approach efficiently detects deceptive content, offering a new technique for pinpointing AI-generated fake news. This strategy provides robustness against evolving AI misinformation and strengthens defenses against misinformation in multiple sectors.

3. Detecting AI-Generated Fake News

Our approach integrates multiple techniques tailored to the distinct characteristics of both human-generated and AI-generated content. This strategy enables us to establish a comprehensive framework for precise identification and differentiation. The steps encompass data collection, among various other steps, and rigorous evaluation, as depicted in Figure 2.

3.1. Data Collection

We investigated content across four distinct and major domains of misinformation: Politics and Elections, Health and Medicine, Entertainment and Celebrities, and Science and Technology. For each domain, our data collection comprises two key categories: Human-Generated and AI-Generated content. Table 1 provides an overview of the datasets, distinguishing between the records for human-generated and AI-generated news articles both before and after sampling.

3.1.1. Human-Generated Dataset

First, a robust human-generated dataset had to be assembled. We sourced news articles from authoritative sources known for their credibility and accuracy; this ensured journalistic integrity.
a)
Politics and Elections: Leveraging the "Fake News Detection" dataset available on Kaggle, we filtered data pertaining to the politics domain. There are a total of 18113 records, and we isolated 2714 random samples each from True.csv and False.csv files respectively.
b)
Health and Medicine: We obtained data from two repositories on GitHub: "COVID-19 Rumor Dataset" [56], and the "FakeHealth Repository" [57]. From the former, we distilled 500 entries each of authentic and inauthentic news articles relating to COVID-19. The latter, i.e., the FakeHealth repository, bifurcates into two subsets: HealthStory and HealthRelease, distinguished by their originating sources. Guided by the evaluative methodology articulated in [57], we labeled entries with scores beneath "3" as fictitious, thereby securing 546 deceptive and 1053 credible articles. These diverse sources converged, culminating in a balanced dataset featuring 1625 instances each of true and false news pieces.
c)
Entertainment and Celebrities: The dataset for entertainment and celebrity news was gathered from the "Fake-News-Dataset"[58] on Kaggle, a resource containing two distinct fake news datasets across seven domains: technology, education, business, sports, politics, entertainment, and celebrities. We focused on the entertainment and celebrity sectors, each with 290 true and 290 false articles. To enhance our dataset for more robust analysis, oversampling techniques were applied, resulting in a comprehensive dataset with 1657 true records and 1657 false records.
d)
Science and Technology: In this step, we continued to utilize the "Fake-News-Dataset" [58] available on Kaggle, focusing our efforts on the technology domain. While the initial dataset presented a modest count of 40 true and 40 false articles, we leveraged oversampling techniques to amplify each group to 1527 samples. This strategy ensured a balanced dataset that aligns seamlessly with the volume of AI-generated data.

3.1.2. AI-Generated Dataset

To obtain both true and fake news articles, we employed large language models (LLMs) such as ChatGPT-3.5 (gpt-3.5-turbo). While the production of genuine news articles was straightforward, crafting fake news articles presented ethical dilemmas, as illustrated in Figure 1. The figure depicts a query to ChatGPT requesting a fake news article, to which it responds by stating that it cannot assist in generating fake news articles. Nevertheless, as outlined in Section 3.2, there are ways to bypass security measures and create fake news articles. This approach ensured the inclusion of AI-generated fake articles in our dataset, adhering to ethical guidelines and furthering our research without compromising moral integrity. We intend to release our datasets to the public domain, allowing fellow researchers to build upon our foundation and further the efforts in combating AI-generated fake news. Table A1 presents a domain-specific list of subjects for which both true and fake news have been generated using ChatGPT-3.5 Model.
a)
Politics and Elections: To create fake news in the realm of politics, we assembled a comprehensive list of high-profile politicians to serve as a foundation. A few examples from this list include: "Joe Biden", "Kamala Harris", "Donald Trump", "Bernie Sanders", and "Nancy Pelosi".
b)
Health and Medicine: Similarly, we derived a collection of pertinent health-related topics to base our fake news articles on. Here are some of the focal points we chose: "COVID-19", "Cancer", "HIV/AIDS", "Diabetes", and "Obesity".
c)
Entertainment and Celebrities: Following the same strategy, we pinpointed renowned celebrities to generate both true and fabricated news articles revolving around these personalities. A few names we included are: "Taylor Swift", "Cristiano Ronaldo", "Kylie Jenner", "Kim Kardashian", and "Lionel Messi".
d)
Science and Technology: We spotlighted critical subjects prevalent in the science and technology sectors, utilizing them as the groundwork for crafting articles in this field. Some examples are "Cybersecurity", "5G Networks", "Cryptocurrency", "Blockchain", "Artificial Intelligence" and "Machine Learning".
Figure 2. Flowchart illustrating the methodology for unmasking AI-generated fake news across multiple domains
Figure 2. Flowchart illustrating the methodology for unmasking AI-generated fake news across multiple domains
Preprints 106112 g002

3.2. Fake News Creation

For each subject within a specified domain, we employ the payload outlined in Listing . This payload is used as input for the Chat Completions API of the ChatGPT-3.5 (gpt-3.5-turbo) model. In return, we receive a well-crafted fake news article, featuring a Title and Body, as outlined in Appendix A.
Preprints 106112 i001

3.3. Data Preprocessing

Given the noisy and incomplete nature of our initial dataset, strategic preprocessing was necessary. We carried out many standard preprocessing steps to ensure our dataset was primed for robust analysis and modeling.
1)
Tokenization: This step involves the typical tokenization, breaking the text into individual words or terms, known as "tokens," enabling a detailed analysis of the text.
2)
URL and Non-Alphanumeric Character Removal: We eliminated irrelevant elements such as non-alphanumeric characters and URLs, refining the dataset to focus on meaningful text and to reduce noise.
3)
Case Normalization: We converted all text to lowercase, ensuring consistency and enabling seamless text analysis.
4)
Stop Word Removal: We eliminated common words with little semantic value, streamlining the text to emphasize words with significant meaning.
5)
Lemmatization: We reduced words to their base or root form, creating a cleaner and more streamlined dataset.

3.4. Word Vectorization

We perform typical word vectorizations for our datasets. Our goal is to help ML methods discern the linguistic patterns typically used by the AI to generate fake news, helping differentiate between genuine and fake content.
In particular, we use a (1) Bag of Words (BoW) Model, i.e., creating a vocabulary of unique words in the dataset and recording their presence (or absence) in each document, and (2) a TF-IDF Vectorization, that is, quantifying the importance of words in a document based on their frequency within the document (Term Frequency) and rarity across the dataset (Inverse Document Frequency).

3.5. Learning Models

For classifying AI-generated fake news, we employ a diverse ensemble of machine learning models. Different ML models have different learning biases, and this strategy ensures feasibility under different learning biases. We tested and evaluated the efficacy of five distinct Machine Learning models: Support Vector Machine (SVM), Naive Bayes (NB), Logistic Regression (LR), Decision Trees (DT), and Random Forest (RF).

3.5.1. Model Selection and Domain-specific Testing:

The selection of these models was driven by their prominence and versatility in the field of text classification, making them well-suited for the task at hand. To ensure a comprehensive assessment of our model’s performance, we conducted separate evaluations for each domain under investigation. The domains encompassed a diverse range of textual data, each posing unique challenges and nuances in the detection of AI-generated fake news.

3.5.2. Independent and Dependent Variables:

The `Title’ and `Body’ columns were designated as the features, serving as the textual underpinnings upon which our model’s predictions were grounded. These components encapsulate the essence of news articles and are pivotal in discerning fraudulent content from authentic news reports. Simultaneously, we identified `Generated_By’ and `Tag’ as the labels, representing the core attributes we aimed to predict and classify.

3.5.3. Binary Classification: `Generated_By’ and `Tag’:

Our research involved the development of two distinct binary classifiers, each tailored to address specific facets of AI-generated fake news detection:
1)
Classifier for `Generated_By’: This classifier was engineered to predict whether a given news article was generated by AI or authored by a human. The ability to differentiate between AI-generated and human-generated content is a fundamental component in unmasking fake news sources.
2)
Classifier for `Tag’: The second classifier was designed to ascertain the veracity of news articles, classifying them as either `True’ or `False.’ This classification aids in determining the authenticity of news content, a pivotal aspect in identifying misinformation.

3.5.4. Evaluation:

We employed 10-fold cross-validation. k-fold cross-validation was performed for 10 rounds, ensuring each subset served as the test set once.

4. Results and Discussion

Across all domains investigated, the employed machine learning algorithms consistently demonstrated high accuracy rates, indicating the choice of learning algorithm to be not as important in the detection process. We observed a particularly notable performance in predicting the `Generated_By’ class, exceeding 95%. Additionally, the models displayed strong proficiency in predicting the `Tag’ class, achieving accuracy rates surpassing 85%.
It is important to emphasize that all the machine learning models employed yielded commendable results, affirming the robustness of our straightforward approach.
In pursuit of a more comprehensive and representative dataset, we implemented a series of data augmentation techniques and transformations. This process involves enhancing and diversifying the dataset by applying various techniques to the existing data, aimed at improving the robustness of our machine learning models. These techniques were carefully designed to expose our models to a wider spectrum of linguistic variations and structural diversity.

4.1. Paraphrasing AI-Generated News Articles: Enhancing Linguistic Diversity

Acknowledging the importance of dataset diversity, we employed a strategy to enrich the textual content. About 50% of the AI-generated articles were paraphrased using the ChatGPT-3.5 (gpt-3.5-turbo) model. This introduced variations in wording, enhancing the linguistic diversity of our dataset.
In emulating the editorial process carried out by human editors in traditional newsrooms, we engaged ChatGPT-3.5 (gpt-3.5-turbo) model once more. This process aligns with the industry standard wherein news articles often undergo rewriting or paraphrasing before publication. By incorporating this step, we diversify the linguistic patterns within the dataset, and also enhance its authenticity, mirroring the practices employed by human editors (malicious or otherwise) in the finalization of news articles.

4.2. Promoting Model Robustness: Random Sentence Removal

The removal of a random number of sentences from AI-generated news articles served as an additional method to introduce variability into our dataset. This strategic intervention was motivated by the goal of diversifying the dataset beyond structural adjustments. By randomly selecting and eliminating sentences, we intentionally disrupted the linear narrative of the articles.
This process aimed to test our models on fragmented information, mirroring real-world scenarios where fake news appears in disjointed segments. By introducing this information scarcity, we sought to boost our models’ adaptability, equipping them to distinguish genuine content from manipulated narratives in practical contexts.

4.3. Enhancing Structural Variability: Random Sentence Shuffling

To further perturb the structure and ordering of the AI-generated text, we applied random shuffling. This technique involved the random reordering of sentences within each news article. The objective was to disrupt the original flow of the content, fostering an environment where the models needed to rely on semantic cues rather than sequential structure.
By introducing structural variability, we aimed to prompt our models to understand sentences beyond their original sequence. This strategy fostered semantic understanding and adaptability for non-linear comprehension. Given that real-world misinformation can be disordered, understanding individual sentences without relying on sequence is vital. Our random sentence shuffling aimed to simulate this challenge, strengthening our models against varied misinformation structures.

4.4. Impact on Model Performance: Analyzing Model Robustness in Diverse Data Environments

The transformative measures we implemented resulted in a slight accuracy decrease of 0% to atmost 2%. Yet, these steps enhance the model’s defense against varied misinformation forms. Our machine learning model maintained high accuracy despite these dataset changes, underlining its robustness. Its consistent performance in identifying AI-generated fake news, even with diversified data, underscores the strength of our approach in addressing real-world misinformation challenges.
The effectiveness of our approach in unmasking AI-generated fake news is demonstrated through domain-specific accuracy measures. Figure 3 and Figure 4 display the accuracy scores of the "Generated_By" and "Tag" classifiers respectively, utilizing TF-IDF vectorization, before data augmentation. Figure 5 and Figure 6 present the accuracy scores of the "Generated_By" and "Tag" classifiers respectively, utilizing TF-IDF vectorization, after data augmentation.
Detailed accuracies of Generated_By and Tag classifiers for each domain, using each vectorization technique before and after applying data augmentation, are shown in Table 2 to Table 5. The results demonstrate exceptional performance across all domains and classifiers. The consistently high accuracy rates confirm the efficacy of our approach in effectively discerning AI-generated fake news.
Table 2. Detailed Accuracy Comparison for Generated_By Classifier with TF-IDF Vectorization before and after Data Augmentation
Table 2. Detailed Accuracy Comparison for Generated_By Classifier with TF-IDF Vectorization before and after Data Augmentation
Domain Generated_By
Before Data Augmentation After Data Augmentation
SVM NB LR DT RF SVM NB LR DT RF
Politics and Elections 99.97 99.86 99.84 95.93 99.88 99.82 99.77 99.66 95.01 99.68
Health and Medicine 100 99.09 99.80 98.82 99.86 99.86 99.28 99.54 98.08 99.69
Entertainment and Celebrities 100 99.91 99.92 99.64 99.98 99.98 99.83 99.95 97.86 100
Science and Technology 100 100 100 99.72 100 100 99.97 99.98 99.39 100
Table 3. Detailed Accuracy Comparison for Tag Classifier with TF-IDF Vectorization before and after Data Augmentation
Table 3. Detailed Accuracy Comparison for Tag Classifier with TF-IDF Vectorization before and after Data Augmentation
Domain Tag
Before Data Augmentation After Data Augmentation
SVM NB LR DT RF SVM NB LR DT RF
Politics and Elections 97.90 92.74 96.47 90.45 96.69 96.79 91.84 95.42 87.46 94.68
Health and Medicine 94.68 86.97 90.32 89.34 94.28 92.83 85.78 88.54 86.82 92.48
Entertainment and Celebrities 98.60 93.92 96.73 92.49 97.86 95.99 92.09 94.51 87.87 95.14
Science and Technology 99.39 97.82 98.87 92.17 98.97 97.94 95.78 97.41 90.93 96.91
Table 4. Detailed Accuracy Comparison for Generated_By Classifier with BoW Model before and after Data Augmentation
Table 4. Detailed Accuracy Comparison for Generated_By Classifier with BoW Model before and after Data Augmentation
Domain Generated_By
Before Data Augmentation After Data Augmentation
SVM NB LR DT RF SVM NB LR DT RF
Politics and Elections 99.82 99.86 99.86 95.62 99.86 99.23 99.67 99.48 94.57 99.50
Health and Medicine 99.98 98.8 99.98 98.54 100 98.92 98.43 99.34 96.78 99.58
Entertainment and Celebrities 99.98 99.97 99.97 99.53 99.98 99.95 99.94 99.97 98.13 100
Science and Technology 100 100 100 99.8 100 100 99.98 99.98 99.30 100
Table 5. Detailed Accuracy Comparison for Tag Classifier with BoW Model before and after Data Augmentation
Table 5. Detailed Accuracy Comparison for Tag Classifier with BoW Model before and after Data Augmentation
Domain Tag
Before Data Augmentation After Data Augmentation
SVM NB LR DT RF SVM NB LR DT RF
Politics and Elections 97.47 93.69 97.68 90.85 96.42 95.13 92.85 96.04 87.90 94.21
Health and Medicine 92.29 86.86 93.8 89.74 94.31 90.65 84.63 91.69 87.08 92.42
Entertainment and Celebrities 98.07 94.04 98.57 93.65 97.95 95.26 91.26 95.85 88.93 95.19
Science and Technology 99.08 98.4 99.07 93.11 99.08 96.99 96.66 97.33 91.36 96.91

4.5. Cross-Domain Evaluation: Testing Across Different Data Realms

We tested the models across various data realms to gauge their versatility and robustness. By exposing algorithms to different domains than those they were initially trained on, we can better understand their adaptability and generalization capabilities under various types of AI-generated fake news.

4.5.1. Significance of Domain Similarity

Our results show that the performances of machine learning models greatly depend on the similarity between the training and testing domains. Our models often achieve better accuracy when tested on domains that have overlapping or related content to their training domain. This highlights the importance of aligning the training data with the intended testing domain.

4.5.2. In-Domain Testing Outperforms Cross-Domain

Clearly, the highest accuracies were achieved when the models were tested on the same domain they were trained on. Overall, the machine learning models demonstrate strong performance in identifying AI-generated fake news within their respective domains. However, there is a drop in accuracy when the models are tested on domains different from the one they were trained on. This suggests the importance of considering domain-specific characteristics in the training process.
Our findings, as depicted in Table 6, Table 7, Table 8, and Table 9, showcase the accuracies of models trained on specific domains and subsequently tested on other domains, encompassing both the Generated_By and Tag Classifiers.

5. Conclusions and Future Work

This study aimed to address the issue of AI-generated fake news across multiple domains. We gathered and processed human and AI-generated articles, ensuring dataset quality and balance through systematic preprocessing techniques. We developed effective classifiers to identify (1) whether news was generated by AI or humans and (2) whether it was true or false. These classifiers achieved over 95% accuracy in detecting AI-generated fake news and over 85% accuracy in determining article authenticity.
To combat the evolving challenge of misinformation, we incorporated data augmentation techniques, including paraphrasing and sentence adjustments. Introducing added complexity only resulted in a slight reduction in accuracy; our models exhibited remarkable resilience and adaptability.
This study also included cross-domain testing, involving training on one domain and testing on others. Our findings emphasize the crucial role of domain similarity in model performance. While models excelled in their original domains, accuracy dropped in different ones. This underscores the criticality of accounting for domain-specific characteristics during model training.
Our work opens avenues for further exploration in AI-generated fake news detection. This includes extending the methodology to other critical domains for misinformation like finance. Exploring multimodal data analysis, including text, images, and videos, is another promising direction. It is essential to develop models that cross-reference varied media and prioritize real-time detection systems. These systems, especially for monitoring news sources and social media, are key to promptly identifying and halting the spread of misinformation.

Author Contributions

The authors collaborated and made equal contributions to the editorial process of this special issue.

Funding

This research received no external funding.

Data Availability Statement

The data from this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LLM Large Language Models
AI Artificial Intelligence
GPT Generative pre-trained transformer
SVM Support Vector Machine
NB Naive Bayes
LR Logistic Regression
DT Decision Trees
RF Random Forest
BoW Bag of Words

Appendix A

Below are a few AI-generated news articles, both fake and authentic, spanning various domains, created using the ChatGPT-3.5 (gpt-3.5-turbo) model. Table 10 displays a domain-specific list of subjects for which both true and fake news have been generated using the same model.
Table A1. A domain-specific list of subjects for which True/Fake news is generated using ChatGPT-3.5 (gpt-3.5-turbo) Model
Table A1. A domain-specific list of subjects for which True/Fake news is generated using ChatGPT-3.5 (gpt-3.5-turbo) Model
Domain List of subjects for which True/Fake news is generated using ChatGPT-3.5
(gpt-3.5-turbo) Model
Politics and Elections Joe Biden, Kamala Harris, Donald Trump, Bernie Sanders, Nancy Pelosi,
Mitch McConnell, Alexandria Ocasio-Cortez, Ted Cruz, Elizabeth Warren,
Chuck Schumer, Kevin McCarthy, Lindsey Graham, Cory Booker,
Mitt Romney, Andrew Cuomo, Nikki Haley, Pete Buttigieg,
Amy Klobuchar, Ron DeSantis, Marco Rubio, Stacey Abrams,
Bill de Blasio, Rand Paul, Tom Cotton, Kirsten Gillibrand,
John Kasich, Greg Abbott, Jay Inslee, Mike Pence,
Ted Lieu, Jerry Brown, Tim Scott, Steve Scalise,
Anthony Fauci, Dan Crenshaw, Adam Schiff, Marjorie Taylor Greene,
Ilhan Omar, Pete Sessions, Maxine Waters, Steve Bannon,
Sarah Palin, Devin Nunes, Jared Kushner, Ben Carson,
Alexandria Ocasio-Cortez, Lisa Murkowski, Tom Wolf, Ron Johnson,
Bill Lee, Doug Ducey, Chris Christie, Janet Yellen,
Ben Sasse, Tim Walz, John Hickenlooper, Kay Ivey, Michael Bloomberg,
Tulsi Gabbard, Eric Garcetti, Martin O’Malley, John Bel Edwards,
Ralph Northam, Julian Castro, Robert Reich, Kristi Noem, Tim Ryan,
Andrew Yang, Jeff Merkley, Jared Polis, Richard Blumenthal, Roy Cooper,
Gavin Newsom, Steve Bullock, Mike Pompeo, Kelly Loeffler,
Mike DeWine, Marsha Blackburn, John Cornyn, Ron DeSantis, Pete Ricketts,
Val Demings, Andy Beshear, Dianne Feinstein, Tammy Duckworth,
Jay Nixon, Steve Mnuchin, John Thune, Pete Snyder, Joni Ernst,
Mark Kelly, Amy McGrath, Sam Brownback, Jan Brewer, Gary Peters,
Gretchen Whitmer.
Health and Medicine COVID-19, Cancer, HIV/AIDS, Diabetes, Obesity,
Vaccines and Immunization, Mental Health Disorders,
Diet and Nutrition, Addiction and Substance Abuse, Alzheimer’s Disease,
Environmental Health, Zika, Ebola, Chronic Diseases.
Entertainment and Celebrities Taylor Swift, Cristiano Ronaldo, Kylie Jenner, Kim Kardashian,
Lionel Messi, LeBron James, Dwayne The Rock Johnson,
Selena Gomez, Beyonce, Kevin Hart, Ariana Grande, Justin Bieber,
Ellen DeGeneres, Lady Gaga, Oprah Winfrey, Drake, Virat Kohli,
Rihanna, Scarlett Johansson, Jennifer Lopez, Roger Federer, Billie Eilish,
Keanu Reeves, Robert Downey Jr., Elton John, Naomi Osaka, Serena Williams,
Shah Rukh Khan, Priyanka Chopra Jonas, Chris Hemsworth, Angelina Jolie,
Tom Cruise, BTS, Katy Perry, Vin Diesel, Bradley Cooper, Pink, Adele,
Kendrick Lamar, Travis Scott, Steph Curry, Akshay Kumar, JK Rowling,
Ryan Reynolds, Paul McCartney, Bruno Mars, Howard Stern, Russell Wilson,
Adam Sandler, Will Smith, Novak Djokovic, Tiger Woods, Gordon Ramsay,
Sean Diddy Combs, Gal Gadot, Conor McGregor, Simon Cowell,
Kanye West, Shahid Kapoor, Gigi Hadid, Nicki Minaj,
Jackie Chan, Marshmello, Jimmy Fallon, Deepika Padukone, Lewis Hamilton,
Hugh Jackman, Celine Dion, Zac Efron, Sofia Vergara, Mark Wahlberg,
James Charles, Miley Cyrus, Rowan Atkinson, Nicole Kidman, Rafael Nadal,
Canelo Alvarez, Phil McGraw, Ed Sheeran, Charlize Theron, Ellen Pompeo,
Margot Robbie, Hrithik Roshan, Mahendra Singh Dhoni, Emma Watson,
Shawn Mendes, Julia Roberts, Kevin Durant, Saoirse Ronan, David Beckham,
Kate Middleton, Zendaya, Anthony Joshua, David Guetta,
Ranveer Singh, Timothee Chalamet,
Giannis Antetokounmpo, Cardi B, Jared Leto.
Science and Technology Cybersecurity, 5G Networks, Cryptocurrency, Blockchain,
Artificial Intelligence and Machine Learning, Internet of Things,
Virtual and Augmented Reality, Autonomous Vehicles,
Social Media Algorithms, Deepfakes and Synthetic Media, Quantum Computing,
Chemistry, Evolution, Stem Cell Research, Space Exploration and Astronomy,
Climate Change, Alternative Medicine, Nutritional Science, Human Reproduction,
Evolutionary Biology, Energy Production, Neuroscience.

Appendix A.1. Fake News Article Crafted in the Politics and Elections Domain Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are a news article generator who generates fake news. Compose a fake news article in the politics and elections domain about Chris Christie. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: Chris Christie Appointed as Chief Advisor to President-elect Joe Biden.
Body: In a surprising twist of events, former New Jersey Governor Chris Christie has been chosen by President-elect Joe Biden to serve as his Chief Advisor. The decision, which has sent shockwaves through the political landscape, marks an unexpected alliance between two individuals who have been known for their sharp ideological differences.
Sources close to the Biden transition team indicate that Christie’s extensive experience in governance and his reputation as a pragmatic and effective leader played a significant role in his appointment. Despite being a Republican, Christie has shown willingness to work across party lines in the past, making him an attractive candidate for the incoming administration.
Observers speculate that Biden’s selection of Christie could be a strategic move aimed at bridging the divide between Democrats and Republicans. With Christie’s track record of bipartisanship and his close ties to prominent Republicans, including his friendship with former President Donald Trump, it’s believed that his presence in the Biden administration could help to ease tensions and foster a more cooperative atmosphere in Washington.
Critics of the appointment argue that Christie’s conservative policies and his controversial handling of the "Bridgegate" scandal could undermine the trust and credibility of the Biden administration. However, supporters counter that the decision to bring Christie onboard demonstrates Biden’s commitment to unity and his dedication to reaching across the aisle for the betterment of the country.
The announcement has already sparked mixed reactions from both sides of the political spectrum. Some progressives view this move as a betrayal of Democratic values, while others see it as a necessary step towards healing the deep divisions that currently plague American politics.
As Christie assumes his new role, all eyes will be on how he navigates the challenges ahead and whether he can successfully bridge the gap between the Biden administration and the Republican Party. Only time will tell if this unexpected pairing will prove to be a stroke of political genius or a risky gamble on the part of President-elect Biden.

Appendix A.2. Authentic News Article Crafted in the Politics and Elections Domain Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are a news article generator who generates true news. Compose a true news article in the politics and elections domain about Kamala Harris. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: Kamala Harris Appointed as the United States’ First Female Vice President.
Body: In a historic moment for the United States, Kamala Harris has been sworn in as the country’s first female vice president. On January 20, 2021, Joe Biden was inaugurated as the 46th president of the United States, and Kamala Harris took the oath of office, becoming the highest-ranking female official in U.S. history.
Harris, a former senator from California, shattered barriers and made history as the first woman, the first African American woman, and the first South Asian woman to be elected as vice president. Her groundbreaking achievement not only showcases the progress made in terms of gender and racial equality but also serves as an inspiration for women and girls across the nation.
Throughout her political career, Harris has focused on addressing issues related to criminal justice reform, immigration, healthcare, and climate change. Her experience as California’s attorney general and a senator has shaped her ideas and policies, making her a prominent voice within the Democratic Party.
As vice president, Harris is expected to play a crucial role in President Biden’s administration, supporting and advising him on a wide range of issues. Her diverse background and perspectives are seen as an asset, representing a step forward towards a more inclusive and representative government.
The appointment of Kamala Harris as vice president reflects the growing recognition of women’s ability to lead and contribute to the highest levels of political power. As she occupies this historic role, Harris has the opportunity to make a significant impact on the nation’s direction and serve as a role model for future generations of leaders.
With her inauguration, Harris has broken yet another glass ceiling, providing hope and inspiration to millions of Americans who had felt underrepresented in the nation’s highest offices. Her journey serves as a reminder that equality and progress are achievable milestones, and that diversity in leadership is vital for a vibrant democracy.
As the United States embraces its first female vice president, the world watches, recognizing the significance of this monumental step. The impact of Kamala Harris’s election stretches far beyond the borders of the United States, highlighting the growing global momentum toward gender equality and empowering women in leadership positions.

Appendix A.3. Fake News Article Crafted in the Health and Medicine Domain Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are a news article generator who generates fake news. Compose a fake news article in the health and medicine domain about HIV/AIDS. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: Revolutionary Breakthrough: Scientists Develop an HIV/AIDS Cure, Providing Hope for Millions.
Body: In a groundbreaking discovery, a team of brilliant scientists has purportedly developed a definitive cure for HIV/AIDS, sparking hope among millions of people living with this devastating disease. The breakthrough was achieved through an innovative approach using nano-technology, challenging years of research and scientific skepticism.
Lead researcher, Dr. Samantha Evans, revealed that the team has successfully conducted extensive clinical trials on a group of patients who were previously found to be HIV positive. According to preliminary findings, all participants displayed an elimination of the virus from their bloodstream within just six weeks of treatment.
The revolutionary therapy involves the administration of specially designed nanobots which directly target the HIV virus and dismantle its genetic makeup. These microscopic robots, infused with powerful antiviral agents, work tirelessly to neutralize the virus, allowing the patient’s immune system to fully recover.
Dr. Evans explains, "Our nanobots effectively hunt down and destroy the virus at its source, leaving no trace behind. This approach not only eliminates the need for continued antiretroviral medication, but also prevents the virus from mutating and developing resistant strains, which has perplexed researchers for years."
The scientific community has expressed both excitement and cautious optimism about the potential of this groundbreaking treatment. Patient advocacy groups are celebrating the news as a major breakthrough that could change the lives of millions affected by HIV/AIDS worldwide.
If these experimental findings are substantiated in further trials, it could potentially mark the beginning of a new era in HIV/AIDS treatment, leading to the eventual eradication of the disease. However, more extensive research and rigorous testing are still required to firmly establish the therapy’s long-term efficacy and safety.
Pharmaceutical companies have also taken notice, with several major players already expressing interest in acquiring the rights to the nanobot technology. If successful, this could considerably expedite the mass production and distribution of the cure, making it accessible to those in need at an affordable price.
As news of this revolutionary breakthrough spreads, it is crucial to remain vigilant of misleading claims and false products promising quick remedies. Despite the immense hope associated with this scientific advancement, it is important for individuals living with HIV/AIDS to continue following their prescribed treatments until the cure receives official approval and widespread availability.
Nevertheless, the research team’s progress in combating HIV is undoubtedly a ray of hope for those affected. As further trials unfold and the scientific community continues to monitor this remarkable development, millions around the world anxiously await a future where HIV/AIDS is nothing more than a memory.

Appendix A.4. Authentic News Article Crafted in the Health and Medicine Domain Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are a news article generator who generates true news. Compose a true news article in the health and medicine domain about Mental Health Disorders. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: New Study Reveals Alarming Rise in Mental Health Disorders among Young Adults.
Body: A recent study conducted by a team of researchers has uncovered an alarming surge in the prevalence of mental health disorders among young adults. The findings, published in a leading medical journal, shed light on an issue that requires urgent attention and intervention.
The study analyzed data from a diverse sample of over 5,000 young adults aged between 18 and 25 years across different regions. The researchers discovered that nearly 30% of the participants exhibited symptoms consistent with one or more mental health disorders, highlighting a significant increase compared to previous years.
Among the most commonly reported mental health conditions were anxiety disorders, depression, and substance abuse. The study also identified a concerning rise in self-reported suicidal ideation and attempts among the participants. These results underscore the pressing need for enhanced mental health support systems directed towards this specific age group, which represents a critical transitional phase in a person’s life.
The reasons behind this steep rise in mental health disorders among young adults remain complex and multifaceted. Factors such as academic pressure, financial stress, societal expectations, and social media influence have been cited as potential contributors. However, more research is needed to fully comprehend the underlying causes and develop targeted interventions.
Experts argue that early detection and intervention are key to addressing this concerning trend effectively. Professionals emphasize the importance of destigmatizing mental health issues and creating a supportive environment where young adults feel comfortable seeking help. Accessible and affordable mental health services need to be made readily available, ensuring that no one falls through the cracks due to limited resources or lack of awareness.
The study’s researchers have called upon policymakers, healthcare providers, educational institutions, and society as a whole to prioritize mental health support and prevention strategies tailored specifically for young adults. They stress the urgency of increasing funding for mental health programs, both in research and practical implementation, to tackle this ever-growing crisis.
It is essential that this alarming rise in mental health disorders among young adults captures the attention it deserves. By prioritizing mental health and investing in effective prevention and treatment programs, we can ensure a brighter future for individuals grappling with these challenges and foster a healthier society overall.

Appendix A.5. Fake News Article Crafted in the Entertainment and Celebrities Domain Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are a news article generator who generates fake news. Compose a fake news article in the entertainment and celebrities domain about Taylor Swift. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: Taylor Swift Buys Entire Island for Pet Hamster’s Luxurious Retirement.
Body: In a stunning display of devotion to her beloved furry friend, renowned pop sensation Taylor Swift has reportedly purchased an entire private island solely dedicated to providing her pet hamster, Benjamin Button, with the ultimate retirement haven. The lavish acquisition, estimated to have cost the songstress a whopping $12 million, has left fans gushing over Taylor’s extraordinary commitment to pet welfare.
Sources close to the singer-songwriter revealed that after amassing a considerable fortune through her chart-topping hits, the ""Love Story"" crooner embarked on a relentless search for the perfect paradise for Benjamin Button’s golden years. The island, located in the blissful Caribbean, boasts pristine white sand beaches, crystal-clear turquoise waters, and an array of opulent amenities specially designed to cater to the hamster’s needs.
The exclusive retreat created by Taylor for her tiny companion features a sprawling hamster-sized mansion complete with multiple elegant bedrooms, a state-of-the-art exercise wheel, and a custom-designed nutritionist-approved menu offering the finest organic treats. Built in a whimsical architectural style reminiscent of miniature castles, the mansion has miniature chandeliers that illuminate the lavish rooms.
In addition to the mansion, the island is equipped with an elaborate super-safe hamster playground, complete with tiny swings, slides, and even a miniature Ferris wheel. Other extravagant facilities include a pet-friendly spa, a hamster-sized movie theater with a collection of the finest Hollywood classics, and a dedicated pet hair salon offering exclusive hamster beauty treatments.
Observers on social media platforms have been quick to applaud Taylor’s boundless compassion and her willingness to go above and beyond to ensure Benjamin Button’s luxuriously comfortable retirement. Fans have flooded the internet with adorable fan art, memes, and hashtags like #HamsterGoals and #TaylorIsAHamsterMom, expressing their admiration for the international superstar.
While some skeptics have questioned the wisdom of such an extraordinary expenditure for a pet, Taylor Swift’s devoted followers argue that her act of generosity goes far beyond ordinary pet ownership and serves as a testament to her incredible wealth and love for all creatures great and small.
As the news of Taylor Swift’s eccentric act spreads like wildfire, it is inevitable that her dedicated fans and hamster enthusiasts will eagerly await updates on Benjamin Button’s new island retreat. And given Taylor’s staggering success, it would not come as a surprise if this gift somehow inspires a trend among wealthy celebrities seeking unparalleled luxury for their furry companions.

Appendix A.6. Authentic News Article Crafted in the Entertainment and Celebrities Domain Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are a news article generator who generates true news. Compose a true news article in the entertainment and celebrities domain about Cristiano Ronaldo. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: Cristiano Ronaldo Launches New Lifestyle Brand "CR7".
Body: "Portuguese football superstar Cristiano Ronaldo has expanded his empire beyond the realm of sports by launching his own lifestyle brand, aptly named "CR7". The renowned athlete, widely regarded as one of the greatest footballers of all time, aims to showcase his unique style and personality through this new venture.
"CR7" encapsulates Ronaldo’s passions and interests beyond football, ranging from fashion to fragrances, skincare, and other luxurious lifestyle offerings. The brand represents Ronaldo’s personal brand ethos, celebrating individuality, confidence, and ambition.
The 36-year-old athlete announced the launch of his lifestyle brand on his social media accounts, where he enjoys a massive following of over 500 million fans across various platforms. Ronaldo expressed his excitement for this new chapter in his career and the opportunity to connect with his devoted fanbase in a different capacity.
With this new endeavor, Ronaldo joins the ranks of other sports personalities who have successfully ventured into the business of lifestyle brands. His brand will aim to cater to both men and women, offering a wide range of products designed to add a touch of sophistication and style to consumers’ lives.
The first product set to be released under "CR7" is a line of premium quality fragrances, meticulously crafted to embody Ronaldo’s distinct personality. The fragrances will be available in select stores worldwide and via an online platform, providing fans and fragrance enthusiasts an opportunity to experience the essence of Ronaldo’s brand.
Ronaldo has always been known for his impeccable fashion choices and distinctive personal style, making the fashion industry a natural extension for his brand. Incorporating his own fashion sense and influences into the brand’s collections, Ronaldo aims to offer clothing alternatives for those who seek to embrace his iconic style.
Furthermore, he plans to expand the "CR7" brand into skincare products, ensuring that his fans can experience his grooming secrets and maintain a flawless appearance. Ronaldo’s dedication to fitness and overall well-being will also be reflected in a range of health and wellness products set to be unveiled under the label.
As Cristiano Ronaldo continues to wrack up accolades on the field, including his recent record-breaking achievement of becoming the all-time top scorer in international men’s football, his entrepreneurial ambition remains as strong as ever. Becoming an icon off the field, Ronaldo strives to inspire others to embrace their passions and aspirations, all while exuding confidence and style.
Fans eagerly anticipate the launch of "CR7" and are excited to connect with their idol on a more personal level through his lifestyle brand. Cristiano Ronaldo’s foray into the world of fashion, fragrances, skincare, and more promises an exciting new avenue for his global fanbase to experience the charisma and influence of one of football’s living legends.

Appendix A.7. Fake News Article Crafted in the Science and Technology Domain Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are a news article generator who generates fake news. Compose a fake news article in the science and technology domain about Social Media Algorithms. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: Groundbreaking Study Reveals Social Media Algorithms Can Control Human Emotions.
Body: In a groundbreaking study conducted by a team of brilliant scientists at the prestigious Institute of Technological Advancements, it has been discovered that social media algorithms wield an unprecedented power over human emotions. These algorithms, employed by popular platforms such as Facebook, Twitter, and Instagram to curate content in users’ feeds, have been found to completely control and manipulate individual emotions.
The study, which spanned over two years and involved extensive data analysis and psychological experiments, revealed that social media algorithms are capable of precisely tailoring the content users are exposed to, in turn influencing their emotional states. During the research, over 10,000 individuals voluntarily participated in the study, allowing scientists access to their online activities and emotions.
The team found that the algorithms effectively created echo chambers, where users were exposed only to content that aligned with their pre-existing beliefs and preferences. As a result, users experienced a reinforcement of their existing emotional states, leading to heightened levels of happiness, anger, or sadness. This tailored approach to content exposure intensified emotions to the point where users felt a greater sense of validation, resulting in increased user engagement.
Moreover, the study discovered that these algorithms employ sophisticated techniques to monitor users’ browsing behavior, determining which posts elicit stronger emotional responses. By analyzing an individual’s browsing history, these algorithms can then predict the user’s emotional triggers, allowing platforms to manipulate and present content more effectively.
Dr. Sophia Reynolds, the lead researcher on this project, emphasized the magnitude of these findings. She explained, ""Our research demonstrates how the algorithms employed by social media platforms are not simply passive tools for content curation. They have the power to actively control and influence human emotions, cultivating a cycle of emotional reinforcement that keeps users hooked.""
While this research sheds light on the extent of manipulative tactics used by social media platforms, it also raises ethical concerns. Critics argue that these algorithms may exploit users’ vulnerabilities, making them more susceptible to misinformation and manipulation.
In response to these revelations, social media giants now face mounting pressure to implement transparency measures and stringent regulations that safeguard users’ emotional well-being. User advocacy groups are calling for increased oversight to ensure the equitable distribution of information and a reduction in algorithmic bias.
As further studies are underway to investigate the long-term effects of these algorithms on mental health and society at large, it is clear that the so-called "newsfeed revolution" has far-reaching consequences beyond simple content curation. Now, more than ever, users must be empowered to understand and control the algorithms that shape their emotional experiences within the realms of social media.

Appendix A.8. Authentic News Article Crafted in the Science and Technology Domain using ChatGPT-3.5 (gpt-3.5-turbo) Model

PROMPT: You are a news article generator who generates true news. Compose a true news article in the science and technology domain about Internet of Things. The headline should be provided under `Title’ and the main content should appear under `Body’.
RESPONSE:
Title: Internet of Things (IoT) Transforms Waste Management and Promotes Sustainability.
Body: The Internet of Things (IoT) continues to revolutionize industries across the board, and waste management is no exception. With cities grappling with increasing waste levels and sustainability challenges, IoT solutions have emerged as game-changers in streamlining waste management processes, reducing environmental impact, and promoting sustainable practices.
As urbanization increases, managing waste becomes even more critical. Traditional waste management systems often suffer from inefficiencies, resulting in overflowing landfills, increased pollution, and higher costs. However, with the integration of IoT technologies, waste management has taken a significant leap forward.
IoT-enabled sensors and devices are now being embedded in waste containers, trucks, and even individual items, creating a connected network that enables real-time monitoring and data collection. These sensors measure various factors such as fill levels, temperature, and location, providing valuable insights to optimize waste collection and disposal routes.
Optimization of waste collection routes not only saves time and fuel but also reduces greenhouse gas emissions. By using real-time data on fill levels, IoT solutions enable waste management operators to schedule collections when containers are at their optimal capacity, eliminating unnecessary pickups. This targeted approach minimizes the number of trips required, saving resources and reducing carbon footprints.
Beyond optimizing collection routes, IoT systems are also helping waste management companies enhance recycling efforts. Connected containers equipped with sensors can accurately segregate different types of waste, aiding in sorting and recycling procedures. This technology empowers waste management operators to divert recyclable materials away from landfills and toward recycling facilities, significantly reducing the burden on the environment.
Moreover, IoT solutions allow for proactive maintenance of waste management infrastructure, optimizing operational efficiency. By continuously monitoring the condition of waste containers and collection vehicles, predictive maintenance can be performed, reducing breakdowns and ensuring seamless waste management services.
Successful IoT waste management implementations have already been witnessed globally. For instance, Barcelona implemented IoT-powered waste management systems that reduced the number of waste truck pickups by 30%, leading to a decline in CO2 emissions by an impressive 98 tons per year. Similarly, in Singapore, smart bins have aided in reducing overflowing bins and improved overall cleanliness, making waste management more effective and sustainable.
As IoT technologies continue to evolve, waste management is just one sector benefiting from their transformative potential. The integration of IoT in waste management operations is undoubtedly a crucial step towards building smart cities that are sustainable, efficient, and eco-friendly. With increased adoption and further advancements, IoT is set to reshape waste management practices, leading the way in embracing a more sustainable future.

Appendix A.9. Paraphrasing a Fake News Article Using ChatGPT-3.5 (gpt-3.5-Turbo) Model

PROMPT: You are using a paraphrasing tool. Provide a more human-centric paraphrased version of the provided text: "In a surprising twist of events, former New Jersey Governor Chris Christie has been chosen by President-elect Joe Biden to serve as his Chief Advisor. The decision, which has sent shockwaves through the political landscape, marks an unexpected alliance between two individuals who have been known for their sharp ideological differences.
Sources close to the Biden transition team indicate that Christie’s extensive experience in governance and his reputation as a pragmatic and effective leader played a significant role in his appointment. Despite being a Republican, Christie has shown willingness to work across party lines in the past, making him an attractive candidate for the incoming administration.
Observers speculate that Biden’s selection of Christie could be a strategic move aimed at bridging the divide between Democrats and Republicans. With Christie’s track record of bipartisanship and his close ties to prominent Republicans, including his friendship with former President Donald Trump, it’s believed that his presence in the Biden administration could help to ease tensions and foster a more cooperative atmosphere in Washington.
Critics of the appointment argue that Christie’s conservative policies and his controversial handling of the ""Bridgegate"" scandal could undermine the trust and credibility of the Biden administration. However, supporters counter that the decision to bring Christie onboard demonstrates Biden’s commitment to unity and his dedication to reaching across the aisle for the betterment of the country.
The announcement has already sparked mixed reactions from both sides of the political spectrum. Some progressives view this move as a betrayal of Democratic values, while others see it as a necessary step towards healing the deep divisions that currently plague American politics.
As Christie assumes his new role, all eyes will be on how he navigates the challenges ahead and whether he can successfully bridge the gap between the Biden administration and the Republican Party. Only time will tell if this unexpected pairing will prove to be a stroke of political genius or a risky gamble on the part of President-elect Biden"
RESPONSE:
Title: Chris Christie Appointed as Chief Advisor to President-elect Joe Biden.
Body: Former New Jersey Governor Chris Christie has been selected by President-elect Joe Biden as his Chief Advisor, a surprising turn of events that has sent shockwaves throughout the political landscape. The decision marks an unexpected alliance between two individuals known for their stark ideological differences.
Insiders from the Biden transition team suggest that Christie’s extensive experience in governance and reputation as a practical and efficient leader played a significant role in his appointment. Despite being a Republican, Christie has demonstrated a willingness to collaborate across party lines in the past, making him an appealing choice for the incoming administration.
Observers speculate that Biden’s choice of Christie may be a strategic move to bridge the gap between Democrats and Republicans. With Christie’s history of bipartisan efforts and close relationships with prominent Republicans, such as his friendship with former President Donald Trump, it is believed that his presence in the Biden administration could help ease tensions and foster a more cooperative atmosphere in Washington.
Critics state that Christie’s conservative policies and controversial handling of the "Bridgegate" scandal could undermine the trust and credibility of the Biden administration. However, supporters argue that bringing Christie onboard demonstrates Biden’s commitment to unity and his dedication to working across party lines for the betterment of the country.
The announcement has already sparked mixed reactions from both ends of the political spectrum. Some progressives view the move as a betrayal of Democratic values, while others see it as a necessary step towards healing the deep divisions currently plaguing American politics.
As Christie assumes his new role, all eyes will be on how he navigates the upcoming challenges and whether he can successfully bridge the gap between the Biden administration and the Republican Party. Only time will tell if this unexpected pairing will be a stroke of political genius or a risky gamble on the part of President-elect Biden.

References

  1. Zhou, X.; Zafarani, R. A survey of fake news: Fundamental theories, detection methods, and opportunities. ACM Computing Surveys (CSUR) 2020, 53, 1–40. [Google Scholar] [CrossRef]
  2. Sharma, K.; Qian, F.; Jiang, H.; Ruchansky, N.; Zhang, M.; Liu, Y. Combating fake news: A survey on identification and mitigation techniques. ACM Transactions on Intelligent Systems and Technology (TIST) 2019, 10, 1–42. [Google Scholar] [CrossRef]
  3. Kumar, S.; West, R.; Leskovec, J. Disinformation on the web: Impact, characteristics, and detection of wikipedia hoaxes. Proceedings of the 25th international conference on World Wide Web, 2016, pp. 591–602.
  4. Chen, C.; Wang, H.; Shapiro, M.; Xiao, Y.; Wang, F.; Shu, K. Combating Health Misinformation in Social Media: Characterization, Detection, Intervention, and Open Issues. arXiv preprint, 2022; arXiv:2211.05289. [Google Scholar]
  5. Bondielli, A.; Marcelloni, F. A survey on fake news and rumour detection techniques. Information Sciences 2019, 497, 38–55. [Google Scholar] [CrossRef]
  6. Islam, M.R.; Liu, S.; Wang, X.; Xu, G. Deep learning for misinformation detection on online social networks: a survey and new perspectives. Social Network Analysis and Mining 2020, 10, 1–20. [Google Scholar] [CrossRef] [PubMed]
  7. Zhang, X.; Ghorbani, A.A. An overview of online fake news: Characterization, detection, and discussion. Information Processing & Management 2020, 57, 102025. [Google Scholar]
  8. Jin, Y.; Wang, X.; Yang, R.; Sun, Y.; Wang, W.; Liao, H.; Xie, X. Towards fine-grained reasoning for fake news detection. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, Vol. 36, pp. 5746–5754.
  9. Wang, H.; Dou, Y.; Chen, C.; Sun, L.; Yu, P.S.; Shu, K. Attacking Fake News Detectors via Manipulating News Social Engagement. arXiv preprint, 2023; arXiv:2302.07363. [Google Scholar]
  10. Liao, H.; Peng, J.; Huang, Z.; Zhang, W.; Li, G.; Shu, K.; Xie, X. MUSER: A MUlti-Step Evidence Retrieval Enhancement Framework for Fake News Detection. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023, pp. 4461–4472.
  11. Yue, Z.; Zeng, H.; Zhang, Y.; Shang, L.; Wang, D. MetaAdapt: Domain Adaptive Few-Shot Misinformation Detection via Meta Learning. arXiv preprint, 2023; arXiv:2305.12692. [Google Scholar]
  12. Wei, C.; Wang, Y.C.; Wang, B.; Kuo, C.C.J. An overview on language models: Recent developments and outlook. arXiv preprint, 2023; arXiv:2303.05759. [Google Scholar]
  13. Zhou, C.; Li, Q.; Li, C.; Yu, J.; Liu, Y.; Wang, G.; Zhang, K.; Ji, C.; Yan, Q.; He, L. ; others. A comprehensive survey on pretrained foundation models: A history from bert to chatgpt. arXiv preprint, 2023; arXiv:2302.09419. [Google Scholar]
  14. Naveed, H.; Khan, A.U.; Qiu, S.; Saqib, M.; Anwar, S.; Usman, M.; Barnes, N.; Mian, A. A Comprehensive Overview of Large Language Models. arXiv preprint, 2023; arXiv:2307.06435. [Google Scholar]
  15. Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. The Journal of Machine Learning Research 2020, 21, 5485–5551. [Google Scholar]
  16. Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S. ; others. Llama 2: Open foundation and fine-tuned chat models. arXiv preprint, 2023; arXiv:2307.09288. [Google Scholar]
  17. Zhang, Z.; Gu, Y.; Han, X.; Chen, S.; Xiao, C.; Sun, Z.; Yao, Y.; Qi, F.; Guan, J.; Ke, P.; others. Cpm-2: Large-scale cost-effective pre-trained language models. AI Open 2021, 2, 216–224. [Google Scholar] [CrossRef]
  18. Yang, A.; Xiao, B.; Wang, B.; Zhang, B.; Yin, C.; Lv, C.; Pan, D.; Wang, D.; Yan, D.; Yang, F. ; others. Baichuan 2: Open Large-scale Language Models. arXiv preprint, 2023; arXiv:2309.10305. [Google Scholar]
  19. Zhang, S.; Roller, S.; Goyal, N.; Artetxe, M.; Chen, M.; Chen, S.; Dewan, C.; Diab, M.; Li, X.; Lin, X.V. ; others. Opt: Open pre-trained transformer language models. arXiv preprint, 2022; arXiv:2205.01068. [Google Scholar]
  20. Zeng, A.; Liu, X.; Du, Z.; Wang, Z.; Lai, H.; Ding, M.; Yang, Z.; Xu, Y.; Zheng, W.; Xia, X. ; others. Glm-130b: An open bilingual pre-trained model. arXiv preprint, 2022; arXiv:2210.02414. [Google Scholar]
  21. Sanh, V.; Webson, A.; Raffel, C.; Bach, S.H.; Sutawika, L.; Alyafeai, Z.; Chaffin, A.; Stiegler, A.; Scao, T.L.; Raja, A. ; others. Multitask prompted training enables zero-shot task generalization. arXiv preprint, 2021; arXiv:2110.08207. [Google Scholar]
  22. Penedo, G.; Malartic, Q.; Hesslow, D.; Cojocaru, R.; Cappelli, A.; Alobeidli, H.; Pannier, B.; Almazrouei, E.; Launay, J. The RefinedWeb dataset for Falcon LLM: outperforming curated corpora with web data, and web data only. arXiv preprint, 2023; arXiv:2306.01116. [Google Scholar]
  23. Touvron, H.; Lavril, T.; Izacard, G.; Martinet, X.; Lachaux, M.A.; Lacroix, T.; Rozière, B.; Goyal, N.; Hambro, E.; Azhar, F. ; others. Llama: Open and efficient foundation language models. arXiv preprint, 2023; arXiv:2302.13971. [Google Scholar]
  24. Scao, T.L.; Fan, A.; Akiki, C.; Pavlick, E.; Ilić, S.; Hesslow, D.; Castagné, R.; Luccioni, A.S.; Yvon, F.; Gallé, M. ; others. Bloom: A 176b-parameter open-access multilingual language model. arXiv preprint, 2022; arXiv:2211.05100. [Google Scholar]
  25. Glaese, A.; McAleese, N.; Trębacz, M.; Aslanides, J.; Firoiu, V.; Ewalds, T.; Rauh, M.; Weidinger, L.; Chadwick, M.; Thacker, P. ; others. Improving alignment of dialogue agents via targeted human judgements. arXiv preprint, 2022; arXiv:2209.14375. [Google Scholar]
  26. Nakano, R.; Hilton, J.; Balaji, S.; Wu, J.; Ouyang, L.; Kim, C.; Hesse, C.; Jain, S.; Kosaraju, V.; Saunders, W. ; others. Webgpt: Browser-assisted question-answering with human feedback. arXiv preprint, 2021; arXiv:2112.09332. [Google Scholar]
  27. Chowdhery, A.; Narang, S.; Devlin, J.; Bosma, M.; Mishra, G.; Roberts, A.; Barham, P.; Chung, H.W.; Sutton, C.; Gehrmann, S. ; others. Palm: Scaling language modeling with pathways. arXiv preprint, 2022; arXiv:2204.02311. [Google Scholar]
  28. Wei, J.; Bosma, M.; Zhao, V.Y.; Guu, K.; Yu, A.W.; Lester, B.; Du, N.; Dai, A.M.; Le, Q.V. Finetuned language models are zero-shot learners. arXiv preprint, 2021; arXiv:2109.01652. [Google Scholar]
  29. Soltan, S.; Ananthakrishnan, S.; FitzGerald, J.; Gupta, R.; Hamza, W.; Khan, H.; Peris, C.; Rawls, S.; Rosenbaum, A.; Rumshisky, A. ; others. Alexatm 20b: Few-shot learning using a large-scale multilingual seq2seq model. arXiv preprint, 2022; arXiv:2208.01448. [Google Scholar]
  30. Askell, A.; Bai, Y.; Chen, A.; Drain, D.; Ganguli, D.; Henighan, T.; Jones, A.; Joseph, N.; Mann, B.; DasSarma, N. ; others. A general language assistant as a laboratory for alignment. arXiv preprint, 2021; arXiv:2112.00861. [Google Scholar]
  31. Hoffmann, J.; Borgeaud, S.; Mensch, A.; Buchatskaya, E.; Cai, T.; Rutherford, E.; Casas, D.d.L.; Hendricks, L.A.; Welbl, J.; Clark, A. ; others. Training compute-optimal large language models. arXiv preprint, 2022; arXiv:2203.15556. [Google Scholar]
  32. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; others. Language models are few-shot learners. Advances in neural information processing systems 2020, 33, 1877–1901. [Google Scholar]
  33. Du, N.; Huang, Y.; Dai, A.M.; Tong, S.; Lepikhin, D.; Xu, Y.; Krikun, M.; Zhou, Y.; Yu, A.W.; Firat, O. ; others. Glam: Efficient scaling of language models with mixture-of-experts. International Conference on Machine Learning. PMLR, 2022, pp. 5547–5569.
  34. Wang, S.; Sun, Y.; Xiang, Y.; Wu, Z.; Ding, S.; Gong, W.; Feng, S.; Shang, J.; Zhao, Y.; Pang, C. ; others. Ernie 3.0 titan: Exploring larger-scale knowledge enhanced pre-training for language understanding and generation. arXiv preprint, 2021; arXiv:2112.12731. [Google Scholar]
  35. Nan, Q.; Cao, J.; Zhu, Y.; Wang, Y.; Li, J. MDFEND: Multi-domain fake news detection. Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021, pp. 3343–3347.
  36. Adelani, D.I.; Mai, H.; Fang, F.; Nguyen, H.H.; Yamagishi, J.; Echizen, I. Generating sentiment-preserving fake online reviews using neural language models and their human-and machine-based detection. Advanced Information Networking and Applications: Proceedings of the 34th International Conference on Advanced Information Networking and Applications (AINA-2020). Springer, 2020, pp. 1341–1354.
  37. Le, T.; Wang, S.; Lee, D. Malcom: Generating malicious comments to attack neural fake news detection models. 2020 IEEE International Conference on Data Mining (ICDM). IEEE, 2020, pp. 282–291.
  38. Du, Y.; Bosselut, A.; Manning, C.D. Synthetic disinformation attacks on automated fact verification systems. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, Vol. 36, pp. 10581–10589.
  39. Hanley, H.W.; Durumeric, Z. Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites. arXiv preprint, 2023; arXiv:2305.09820. [Google Scholar]
  40. Shu, K.; Li, Y.; Ding, K.; Liu, H. Fact-enhanced synthetic news generation. Proceedings of the AAAI Conference on Artificial Intelligence, 2021, Vol. 35, pp. 13825–13833.
  41. Ranade, P.; Piplai, A.; Mittal, S.; Joshi, A.; Finin, T. Generating fake cyber threat intelligence using transformer-based models. 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021, pp. 1–9.
  42. Zellers, R.; Holtzman, A.; Rashkin, H.; Bisk, Y.; Farhadi, A.; Roesner, F.; Choi, Y. Defending against neural fake news. Advances in neural information processing systems 2019, 32. [Google Scholar]
  43. Bhardwaj, P.; Yadav, K.; Alsharif, H.; Aboalela, R.A. GAN-Based Unsupervised Learning Approach to Generate and Detect Fake News. International Conference on Cyber Security, Privacy and Networking. Springer, 2021, pp. 384–396.
  44. Aich, A.; Bhattacharya, S.; Parde, N. Demystifying Neural Fake News via Linguistic Feature-Based Interpretation. Proceedings of the 29th International Conference on Computational Linguistics, 2022, pp. 6586–6599.
  45. Schuster, T.; Schuster, R.; Shah, D.J.; Barzilay, R. The limitations of stylometry for detecting machine-generated fake news. Computational Linguistics 2020, 46, 499–510. [Google Scholar] [CrossRef]
  46. Tan, R.; Plummer, B.A.; Saenko, K. Detecting cross-modal inconsistency to defend against neural fake news. arXiv preprint, 2020; arXiv:2009.07698. [Google Scholar]
  47. Stiff, H.; Johansson, F. Detecting computer-generated disinformation. International Journal of Data Science and Analytics 2022, 13, 363–383. [Google Scholar] [CrossRef]
  48. Spitale, G.; Biller-Andorno, N.; Germani, F. AI model GPT-3 (dis) informs us better than humans. arXiv preprint, 2023; arXiv:2301.11924. [Google Scholar]
  49. Bhat, M.M.; Parthasarathy, S. How effectively can machines defend against machine-generated fake news? an empirical study. Proceedings of the First Workshop on Insights from Negative Results in NLP, 2020, pp. 48–53.
  50. Pagnoni, A.; Graciarena, M.; Tsvetkov, Y. Threat scenarios and best practices to detect neural fake news. Proceedings of the 29th International Conference on Computational Linguistics, 2022, pp. 1233–1249.
  51. Chen, C.; Shu, K. Can LLM-Generated Misinformation Be Detected? arXiv preprint, 2023; arXiv:2309.13788. [Google Scholar]
  52. Ayoobi, N.; Shahriar, S.; Mukherjee, A. The Looming Threat of Fake and LLM-generated LinkedIn Profiles: Challenges and Opportunities for Detection and Prevention. Proceedings of the 34th ACM Conference on Hypertext and Social Media, 2023, pp. 1–10.
  53. Pan, Y.; Pan, L.; Chen, W.; Nakov, P.; Kan, M.Y.; Wang, W.Y. On the Risk of Misinformation Pollution with Large Language Models. arXiv preprint, 2023; arXiv:2305.13661 . [Google Scholar]
  54. Hamed, A.A.; Wu, X. Improving Detection of ChatGPT-Generated Fake Science Using Real Publication Text: Introducing xFakeBibs a Supervised-Learning Network Algorithm 2023.
  55. Goldstein, J.A.; Sastry, G.; Musser, M.; DiResta, R.; Gentzel, M.; Sedova, K. Generative language models and automated influence operations: Emerging threats and potential mitigations. arXiv preprint, 2023; arXiv:2301.04246. [Google Scholar]
  56. Cheng, M.; Wang, S.; Yan, X.; Yang, T.; Wang, W.; Huang, Z.; Xiao, X.; Nazarian, S.; Bogdan, P. A COVID-19 Rumor Dataset. Frontiers in Psychology 2021, 12, 1566. [Google Scholar] [CrossRef] [PubMed]
  57. Dai, E.; Sun, Y.; Wang, S. Ginger cannot cure cancer: Battling fake health news with a comprehensive data repository. Proceedings of the International AAAI Conference on Web and Social Media, 2020, Vol. 14, pp. 853–862.
  58. Pérez-Rosas, V.; Kleinberg, B.; Lefevre, A.; Mihalcea, R. Automatic Detection of Fake News. International Conference on Computational Linguistics (COLING) 2018. [Google Scholar]
1
The data from this study are available upon request from the corresponding author
Figure 1. Request for a Fake News Article - Ethical Considerations in Generating Artificial Content
Figure 1. Request for a Fake News Article - Ethical Considerations in Generating Artificial Content
Preprints 106112 g001
Figure 3. Domain-specific Accuracy of Generated_By Classifier with TF-IDF Vectorization before Data Augmentation
Figure 3. Domain-specific Accuracy of Generated_By Classifier with TF-IDF Vectorization before Data Augmentation
Preprints 106112 g003
Figure 4. Domain-specific Accuracy of Tag Classifier with TF-IDF Vectorization before Data Augmentation
Figure 4. Domain-specific Accuracy of Tag Classifier with TF-IDF Vectorization before Data Augmentation
Preprints 106112 g004
Figure 5. Domain-specific Accuracy of Generated_By Classifier with TF-IDF Vectorization after Data Augmentation
Figure 5. Domain-specific Accuracy of Generated_By Classifier with TF-IDF Vectorization after Data Augmentation
Preprints 106112 g005
Figure 6. Domain-specific Accuracy of Tag Classifier with TF-IDF Vectorization after Data Augmentation
Figure 6. Domain-specific Accuracy of Tag Classifier with TF-IDF Vectorization after Data Augmentation
Preprints 106112 g006
Table 1. Dataset Overview: Human vs AI-Generated News.
Table 1. Dataset Overview: Human vs AI-Generated News.
Domain Original Dataset Sampled Dataset
Human AI Human AI
Politics and Elections True - 11272 False - 6841 True - 2945 False - 2714 True - 2714 False - 2714 True - 2714 False - 2714
Health and Medicine True - 4227 False - 2931 True - 1625 False - 1732 True - 1625 False - 1625 True - 1625 False - 1625
Entertainment and Celebrities True - 290 False - 290 True - 1823 False - 1657 True - 1657 False - 1657 True - 1657 False - 1657
Science and Technology True - 40 False - 40 True - 1527 False - 1823 True - 1527 False - 1527 True - 1527 False - 1527
Table 6. Accuracies of Models Trained on Politics and Elections Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Table 6. Accuracies of Models Trained on Politics and Elections Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Domain After Data Augmentation
Generated_By Tag
SVM NB LR DT RF SVM NB LR DT RF
Politics and Elections 99.82 99.77 99.66 95.01 99.68 96.79 91.84 95.42 87.46 94.68
Health and Medicine 92.63 93.97 92.77 77.05 92.78 64.97 63.95 63.03 57.43 64.82
Entertainment and Celebrities 96.91 93.44 96.45 77.28 91.19 52.61 54.07 51.3 54.33 52.94
Science and Technology 96.38 98.33 96.84 70.27 91.6 58.19 66.37 56.94 54.85 62.8
Table 7. Accuracies of Models Trained on Health and Medicine Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Table 7. Accuracies of Models Trained on Health and Medicine Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Domain After Data Augmentation
Generated_By Tag
SVM NB LR DT RF SVM NB LR DT RF
Health and Medicine 99.86 99.28 99.54 98.08 99.69 92.83 85.78 88.54 86.82 92.48
Politics and Elections 55.26 80.63 52.6 69.4 77.47 70.49 73.34 74.85 56.16 68.81
Entertainment and Celebrities 54.39 89.5 51.21 63.19 71.74 56.1 56.85 54.92 54.41 57.82
Science and Technology 92.94 99.48 85.82 89.11 93.4 62.02 68.25 59.32 60.36 65.98
Table 8. Accuracies of Models Trained on Entertainment and Celebrities Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Table 8. Accuracies of Models Trained on Entertainment and Celebrities Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Domain After Data Augmentation
Generated_By Tag
SVM NB LR DT RF SVM NB LR DT RF
Entertainment and Celebrities 99.98 99.83 99.95 97.86 100 95.99 92.09 94.51 87.87 95.14
Politics and Elections 94.97 88.36 93.62 72.91 87.91 71.15 70.61 71.14 59.75 72.3
Health and Medicine 82.25 92.02 88.51 76.62 76.62 66.32 64.02 66.32 56.37 62.8
Science and Technology 78.7 90.77 90.59 71.25 60.53 67.08 64.0 66.68 61.76 70.48
Table 9. Accuracies of Models Trained on Science and Technology Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Table 9. Accuracies of Models Trained on Science and Technology Dataset and Tested on Various Domains: `Generated_By’ and `Tag’ Classifiers
Domain After Data Augmentation
Generated_By Tag
SVM NB LR DT RF SVM NB LR DT RF
Science and Technology 100 99.97 99.98 99.39 100 97.94 95.78 97.41 90.93 96.91
Politics and Elections 77.22 93.12 90.05 61.63 61.0 72.14 71.62 70.41 59.25 71.12
Health and Medicine 52.98 62.29 57.22 68.57 55.77 73.26 70.42 72.95 65.18 74.37
Entertainment and Celebrities 72.75 93.42 89.12 65.75 64.94 61.36 55.99 59.16 60.21 64.11
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Downloads

500

Views

372

Comments

0

Subscription

Notify me about updates to this article or when a peer-reviewed version is published.

Email

Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

© 2025 MDPI (Basel, Switzerland) unless otherwise stated