Submitted: 29 November 2025
Posted: 01 December 2025
Abstract
Keywords:
Introduction
Background and Context
Problem Statement
Purpose of the Study
Literature Review
Linguistic and Affective Factors Underlying Discrepancies
Limitations and Research Gaps
Methods
Research Question
Hypothesis Testing
Data
Outcome Variable: Prediction Status (Match/Mismatch)
Predictor Variables: LIWC Features
Predictor Variables: VADER Features
Exploratory Data Analysis


Relationship Between Predictor Variables



Model and Results

Model Explanation and Diagnostics



Conclusion
General Discussion
Code Availability Statement
Data Availability Statement
Acknowledgments
References
- Blodgett, J.; Yang, K.; Stokes, R.; Galiatsatos, P. Public Health Advocacy Dataset (PHAD). University of Arkansas Computer Vision and Image Understanding (CVIU) Lab. n.d. Available online: https://uark-cviu.github.io/projects/PHAD/#annotations.
- Blodgett, S. L.; Barocas, S.; Daumé, H., III; Wallach, H. Language (technology) is power: A critical survey of “bias” in NLP. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020; pp. 5454–5476.
- Boyd, R. L.; Ashokkumar, A.; Seraj, S.; Pennebaker, J. W. The development and psychometric properties of LIWC-22; University of Texas at Austin, 2022.
- Centers for Disease Control and Prevention. Youth and Tobacco Use. 2024. Available online: https://www.cdc.gov/tobacco/php/data-statistics/youth-data-tobacco/index.html.
- Cero, I.; Luo, J.; Falligant, J. M. Lexicon-based sentiment analysis in behavioral research. Perspectives on Behavior Science 2024, 47(1), 283–310. Available online: https://link.springer.com/article/10.1007/s40614-023-00394-x.
- Guess, A. M.; Malhotra, N.; Pan, J.; Barberá, P.; Allcott, H.; Brown, T.; Tucker, J. A. How do social media feed algorithms affect attitudes and behavior in an election campaign? Science 2023, 381(6656), 398–404.
- Hutto, C. J.; Gilbert, E. VADER: A parsimonious rule-based model for sentiment analysis of social media text. Proceedings of the International AAAI Conference on Web and Social Media 2014, 8(1), 216–225. Available online: https://ojs.aaai.org/index.php/ICWSM/article/view/14550.
- Jim, J. R.; Talukder, M. A. R.; Malakar, P.; Kabir, M. M.; Nur, K.; Mridha, M. F. Recent advancements and challenges of NLP-based sentiment analysis: A state-of-the-art review. Natural Language Processing Journal 2024, 6, 100059. Available online: https://www.sciencedirect.com/science/article/pii/S2949719124000074.
- Kemp, S. Digital 2020: Global Digital Overview 2020, 10. Available online: https://datareportal.com/reports/digital-2020-global-digital-overview.
- Kemp, S. Digital 2025: Global Overview Report 2025, 11. Available online: https://datareportal.com/reports/digital-2025-global-overview-report.
- Lee, J.; Krishnan-Sarin, S.; Kong, G. Social Media Use and Subsequent E-Cigarette Susceptibility, Initiation, and Continued Use Among US Adolescents. Available online: https://www.cdc.gov/pcd/issues/2023/22_0415.htm.
- Lee, J.; Tan, A. S.; Porter, L.; Young-Wolff, K. C.; Carter-Harris, L.; Salloum, R. G. Association between social media use and vaping among Florida adolescents, 2019. Preventing Chronic Disease 2021, 18, E49.
- Margolis, K. A.; Donaldson, E. A.; Portnoy, D. B.; Robinson, J.; Neff, L. J.; Jamal, A. E-cigarette openness, curiosity, harm perceptions and advertising exposure among U.S. middle and high school students. Preventive Medicine 2018, 112, 119–125.
- Narayanan, A. Understanding social media recommendation algorithms. Journal of Digital Media & Policy 2023, 14(1), 22–39.
- Office of the Surgeon General. Social media and youth mental health: The U.S. Surgeon General’s advisory. U.S. Department of Health and Human Services. 2023. Available online: https://www.ncbi.nlm.nih.gov/books/NBK594759/.
- Öhman, E. The validity of lexicon-based sentiment analysis in interdisciplinary research. In Proceedings of the Workshop on Natural Language Processing for Digital Humanities, December 2021; pp. 7–12. Available online: https://aclanthology.org/2021.nlp4dh-1.2/.
- Öhman, E.; Persson, J. The limits of lexicon-based sentiment analysis for social media text. Journal of Language Technology and Computation 2021, 36(2), 45–63. Available online: https://jltec.org/articles/2021-limitations-sentiment-analysis.
- Reagan, A. J.; Mitchell, L.; Kiley, D.; et al. The emotional arcs of stories are dominated by six basic shapes. EPJ Data Sci. 2016, 5, 31.
- Statista. Social media advertising spending worldwide from 2017 to 2025. 2024. Available online: https://www.statista.com/statistics/271406/advertising-revenue-of-social-networks-worldwide/.
- Tavernor, J.; El-Tawil, Y.; Mower-Provost, E. The whole is bigger than the sum of its parts: Modeling individual annotators to capture emotional variability. arXiv. 2024. Available online: https://arxiv.org/abs/2408.11956.
- U.S. Department of Health and Human Services. Social media and youth mental health: The U.S. Surgeon General’s advisory. 2023. Available online: https://www.hhs.gov/sites/default/files/sg-youth-mental-health-social-media-advisory.pdf.
- University of Arkansas CVIU Lab. Public Health Advocacy Dataset (PHAD) [Dataset]. n.d. Available online: https://uark-cviu.github.io/projects/PHAD/#annotations.
- Unsloth AI. Unsloth: Efficient fine-tuning for large language models. 2024. Available online: https://github.com/unslothai/unsloth.
- Venrick, S. J.; Kelley, D. E.; O’Brien, E.; Margolis, K. A.; Navarro, M. A.; Alexander, J. P.; O’Donnell, A. N. U.S. digital tobacco marketing and youth: A narrative review. Preventive Medicine Reports 2022, 31, 102094.
- Vogel, E. A.; Barrington-Trimis, J. L.; Vassey, J.; Soto, D.; Unger, J. B. Young adults’ exposure to and engagement with tobacco-related social media content and subsequent tobacco use. Nicotine & Tobacco Research 2024, 26 (Supplement 1), S3–S12.
- Williams, R. Chipotle smashes TikTok records with #GuacDance challenge. Marketing Dive. 2 August 2019. Available online: https://www.marketingdive.com/news/chipotle-smashes-tiktok-records-with-guacdance-challenge/560102/.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).