Submitted: 09 August 2025
Posted: 11 August 2025
Abstract
Keywords:
1. Introduction
Theoretical Framework
2. Literature Review
Inflation and Consumer Behavior
Self-Care as an Economic Phenomenon
Digital Finance Discourse
NLP for Behavioral Insight
3. Researcher Positionality
Real-World Case Study
4. Research Questions and Hypotheses
- RQ1: How do financial discussions online reflect self-care and coping during inflation? This question stems from the recognition that self-care in times of financial stress often takes unconventional forms: sometimes it is a purchase, sometimes a behavior, sometimes an attitude. The goal is to capture this breadth in real-world discussions.
- RQ2: Which topics show the most extreme sentiment and uncertainty? This is rooted in the idea that not all financial topics evoke the same emotional response. Some issues, like scams or medical debt, might carry heavier emotional weight and higher uncertainty than others, such as refunds or small windfalls.
- RQ3: Can a fine-tuned transformer reliably detect self-care in finance conversations? This connects the human element to the technical challenge: can advanced NLP tools not only find relevant discussions but also do so with enough accuracy to capture nuance?
Data Exploration & Preprocessing
5. Methods
Introduction to Methods
Data & Labeling
Splits & Seed
Model & Unsloth
Class Imbalance Handling
- Stratified Splits: Ensuring proportional representation of both classes in each subset.
- Performance Metrics Beyond Accuracy: Emphasis on F1-score, precision, and recall to better capture the model’s ability to handle the minority class.
- Class Weights: Incorporating weights into the loss function to penalize misclassification of minority class examples more heavily, which yielded small but meaningful improvements in recall.
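The first and third strategies above can be illustrated with a minimal, dependency-free sketch. The split fractions, the seed, the toy label counts, and the "balanced" weighting formula (n_samples / (n_classes * class_count)) are all assumptions made for illustration; the source does not state which values or which weighting scheme the study used.

```python
import math
import random
from collections import Counter, defaultdict

def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=42):
    """Return train/val/test index lists that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class, then slice
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]
    return train, val, test

def balanced_class_weights(labels):
    """Assumed heuristic: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}

def weighted_log_loss(y_true, p_pos, weights):
    """Binary cross-entropy where each example is scaled by its class weight."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(y_true, p_pos):
        w = weights[y]
        total += -w * math.log(p if y == 1 else 1.0 - p)
        weight_sum += w
    return total / weight_sum

# Hypothetical toy data: 90 "other" posts (0) and 10 "self-care" posts (1).
labels = [0] * 90 + [1] * 10
train_idx, val_idx, test_idx = stratified_split(labels)
weights = balanced_class_weights([labels[i] for i in train_idx])
```

With these toy counts, a mistake on a minority-class post contributes roughly nine times as much to the loss as a mistake on a majority-class post, which is the mechanism by which class weighting can raise minority-class recall.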
Evaluation Tools
- Confusion Matrix: Provided a detailed breakdown of true positives, false positives, true negatives, and false negatives, allowing for targeted error analysis.
- ROC Curve & AUC: Illustrated the trade-offs between sensitivity and specificity, with the Area Under the Curve (AUC) offering a single summary measure of discrimination ability.
- SHAP & CM-SHAP: SHapley Additive exPlanations (SHAP) values were calculated to identify the most influential tokens driving the model’s predictions. Class-conditional SHAP (CM-SHAP) analysis was used to compare the feature importance patterns for correctly versus incorrectly classified posts.
- BERT (Bidirectional Encoder Representations from Transformers) was chosen for its strength in understanding context. Unlike traditional models that read text left-to-right or right-to-left, BERT processes entire sentences at once, capturing relationships between words regardless of position. This bidirectional nature allowed it to distinguish between phrases like “cheap skincare” as a cost-conscious choice versus as a sarcastic remark. Fine-tuning BERT on the dataset meant adjusting its weights so it could pick up patterns unique to financial and self-care language, making the model more domain-aware.
- SHAP (SHapley Additive exPlanations) acted as the model’s “interpreter.” While BERT can predict whether a post relates to self-care, SHAP explains why by assigning each word or token a contribution score. For instance, SHAP could show that terms like “budget-friendly,” “splurge,” or “stress relief” drove predictions. This interpretability bridged the gap between technical output and human insight, aligning with the study’s aim to connect machine learning results to real-world behavioral cues.
- The Confusion Matrix served as a reality check for model performance. It breaks predictions into four categories — True Positives, False Positives, True Negatives, and False Negatives. In human terms, it revealed not only how often the model was correct, but also how it made mistakes. For example, a high number of False Positives might indicate over-triggering on generic wellness terms, while many False Negatives could signal missed subtle cues. This breakdown informed both the evaluation of current performance and the roadmap for model improvement.
- By combining BERT’s deep contextual understanding, SHAP’s transparency, and the Confusion Matrix’s diagnostic power, the methods ensured that results were both technically sound and meaningfully interpretable.
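To make the evaluation quantities described above concrete, the following sketch computes the four confusion-matrix cells, precision, recall, F1, and a rank-based AUC in plain Python. The labels, scores, and the 0.5 decision threshold are hypothetical toy values, not results from the study, and the AUC here uses the pairwise-comparison definition rather than any particular library routine.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels where 1 = self-care."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def precision_recall_f1(tp, fp, fn):
    """Minority-class metrics; each is defined as 0.0 when its denominator is 0."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy predictions for eight posts.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
auc = roc_auc(y_true, scores)
```

In this toy case the single false positive and single false negative are the posts one would inspect first in an error analysis, for example to check whether the false positive was triggered by a generic wellness term.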
Topic Modeling & Sentiment Analysis
6. Results
Model Performance
Confusion Matrix Highlights
ROC Curve
SHAP Insights
Topic & Sentiment Patterns
Data Leakage Concerns
- Near-duplicate posts across train, validation, and test sets due to reposts or quoted replies common in forums.
- Keyword labeling bias, where the presence of certain terms in both training and testing data allowed the model to perform well without developing deeper contextual understanding.
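One way to screen for the first concern is to compare every test post against the training posts and flag pairs with high lexical overlap. The sketch below uses Jaccard similarity over word 3-shingles; the shingle size, the 0.5 threshold, the function names, and the example posts are illustrative assumptions rather than the procedure used in the study.

```python
import re

def shingles(text, k=3):
    """Set of overlapping k-word tuples from lowercased text."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    if len(tokens) < k:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def cross_split_near_duplicates(train_texts, test_texts, threshold=0.5):
    """Return (train_index, test_index, similarity) for suspicious pairs."""
    train_sets = [shingles(t) for t in train_texts]
    hits = []
    for j, text in enumerate(test_texts):
        test_set = shingles(text)
        for i, train_set in enumerate(train_sets):
            score = jaccard(train_set, test_set)
            if score >= threshold:
                hits.append((i, j, score))
    return hits

# Hypothetical posts: the first test post quotes a training post in a reply.
train_texts = [
    "I finally paid off my medical debt and booked a massage to celebrate",
    "Is this refund email from my bank a scam?",
]
test_texts = [
    "> I finally paid off my medical debt and booked a massage to celebrate\nCongrats!",
    "How should I budget for groceries this month?",
]
hits = cross_split_near_duplicates(train_texts, test_texts)
```

Flagged pairs can then be removed from the evaluation set or grouped so that a post and its quoted replies always fall in the same split. The all-pairs comparison is quadratic, so a larger corpus would likely need an approximate method such as MinHash.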
Error Analysis
Discussion
Limitations
Practical Implications
Future Research
Supplementary Materials
References
- Baker, S. R.; Bloom, N.; Davis, S. J. Measuring economic policy uncertainty. *The Quarterly Journal of Economics* 2016, 131(4), 1593–1636.
- Bureau of Labor Statistics (BLS). Consumer Price Index Summary. 2022. Available online: https://www.bls.gov/news.release/cpi.nr0.htm.
- Grootendorst, M. BERTopic: Neural topic modeling with class-based TF-IDF. arXiv 2022.
- Hu, Z.; Zhao, L.; Huang, S. Sentiment analysis in financial texts: A survey. *IEEE Access* 2019, 7, 131019–131033.
- Li, F.; Yu, L.; Huang, J. Financial sentiment analysis for risk prediction on internet financial forums. *Decision Support Systems* 2014, 65, 69–79.
- Lusardi, A.; Mitchell, O. S. The economic importance of financial literacy: Theory and evidence. *Journal of Economic Literature* 2014, 52(1), 5–44.
- Global Wellness Institute. The Global Wellness Economy: Looking Beyond COVID. 2023. Available online: https://globalwellnessinstitute.org/.
- Anderson, R. Consumer behavior in inflationary times: Psychological responses to rising costs. *Journal of Economic Psychology* 2022, 88(1), 112–125.
- McKinsey & Company. Inflation and consumer behavior: What shoppers are doing differently. 2023. Available online: https://www.mckinsey.com/.
- Wikipedia contributors. Lipstick effect. Wikipedia. 2024. Available online: https://en.wikipedia.org/wiki/Lipstick_effect.
- Dai, R.; Liu, Z.; Zhao, H.; Zhang, J. Understanding financial well-being and psychological stress through text analysis. *Journal of Financial Counseling and Planning* 2024, 35(1), 23–42. Available online: https://www.sciencedirect.com/science/article/pii/S2214109X24001335.
- Hill, S. E.; Rodeheffer, C.; Griskevicius, V.; Durante, K.; White, A. E. The lipstick effect: Women’s spending behavior during economic recession. *International Journal of Research in Marketing* 2024, 31(2), 115–123. Available online: https://en.wikipedia.org/wiki/Lipstick_effect.
- Vlada, A.; Zimmerman, L.; Martin, C. Economic insecurity, health behavior, and consumer confidence: A machine learning study during global inflation. *Public Health Reports* 2024, 139(2), 105–119. Available online: https://www.mdpi.com/1660-4601/22/1/26.
- Wang, Y.; Tse, Y.; Chan, K. Consumer uncertainty and the cost of living: A multidimensional analysis. *Frontiers in Psychology* 2023, 14, 10887512. Available online: https://pmc.ncbi.nlm.nih.gov/articles/PMC10887512/.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).