Submitted: 11 April 2026
Posted: 13 April 2026
Abstract
Keywords:
1. Introduction
2. Theoretical Background and Literature Review
2.1. Customer Equity and the Logic of Differentiated Investment
2.2. Behavioral Segmentation in Multi-Category Digital Retail
2.3. Machine Learning Approaches to Customer Segmentation
2.4. The Interpretability Imperative
3. Data and Methodology
3.1. Dataset and Analytical Context
3.2. Behavioral Feature Engineering: The RFM-B Framework
3.3. Segmentation Procedure and Cluster Validation
3.4. Ethical Considerations and Data Availability
4. Behavioral Landscape: Exploratory Analysis
4.1. The Platform Event Funnel and Monetary Distribution
4.2. Feature Correlation Structure
5. Segmentation Results: Five Customer Archetypes
5.1. Cluster Selection and Structural Validation
5.2. Segment Profiles: Behavioral Characteristics and Strategic Interpretation
5.2.1. Champions (n = 7,067; 11.0% of Buyers)
5.2.2. Loyal Customers (n = 14,092; 21.9% of Buyers)
5.2.3. Potential Loyalists (n = 3,577; 5.6% of Buyers)
5.2.4. At-Risk Customers (n = 25,490; 39.7% of Buyers)
5.2.5. Lost (n = 13,978; 21.8% of Buyers)
5.3. Segment × Category Conversion Analysis
6. Segment Recoverability and Behavioral Feature Attribution
6.1. Machine Learning Classification Performance
6.2. Behavioral Feature Attribution
7. Discussion
7.1. The RFM-B Framework: Design Logic and Operational Implications
7.2. Resource Allocation Logic Across the Five Segments
7.3. Personalization, Category Specificity, and the Consideration Touchpoint
7.4. Interpretability as an Organizational Prerequisite
8. Limitations and Future Research Directions
9. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| AI | Artificial Intelligence |
| AOV | Average Order Value |
| CLV | Customer Lifetime Value |
| CVR | Conversion Rate |
| ML | Machine Learning |
| PCA | Principal Component Analysis |
| RF | Random Forest |
| RFM | Recency–Frequency–Monetary |
| RFM-B | Recency–Frequency–Monetary–Behavioral (extended framework) |
| ROI | Return on Investment |
| SKU | Stock Keeping Unit |

| Characteristic | Detail |
|---|---|
| Observation window | 1 October – 30 November 2019 (61 days) |
| Raw interaction events | 4,635,837 |
| Event breakdown | View: 84.5%; Add-to-cart: 11.2%; Purchase: 4.3% |
| Raw unique users | 285,143 |
| Analytical sample (≥1 purchase) | 64,204 users |
| Product catalog | 168,295 SKUs across 8 top-level categories |
| Price range | USD 0.01–2,273.98 (median: USD 34.70) |
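
The shares in the table above imply approximate event counts and a buyer share. The following is a minimal arithmetic sketch that uses only figures from the table. The derived counts are approximations because the published percentages are rounded to one decimal place, and the function names are illustrative rather than taken from the paper.

```python
# Figures copied from the dataset table; derived counts are approximate
# because the published percentages are rounded to one decimal place.
TOTAL_EVENTS = 4_635_837
RAW_USERS = 285_143
BUYERS = 64_204
EVENT_SHARES = {"view": 0.845, "add_to_cart": 0.112, "purchase": 0.043}


def implied_event_counts(total, shares):
    """Return the approximate number of events per type implied by the shares."""
    return {name: round(total * share) for name, share in shares.items()}


def buyer_share(buyers, users):
    """Fraction of raw users who made at least one purchase."""
    return buyers / users


counts = implied_event_counts(TOTAL_EVENTS, EVENT_SHARES)
for name, count in counts.items():
    print(f"{name:12s} ~{count:,}")
print(f"buyers / raw users = {buyer_share(BUYERS, RAW_USERS):.1%}")
# Platform-level purchases per view; not the same as the user-level CVR feature.
print(f"purchases per view = {counts['purchase'] / counts['view']:.3f}")
```

The platform-level ratio printed last pools all users, so it should not be compared directly with the user-level CVR values reported for the analytical sample, which contains buyers only.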

| Feature | Symbol | Operational definition | Observed range |
|---|---|---|---|
| Recency | R | Days from the user's most recent purchase to 30 November 2019 | [1, 61] |
| Frequency | F | Count of distinct purchase sessions in the observation window | [1, 80] |
| Monetary | M | Total USD spend accumulated across the observation window | [5, 4,000] |
| Conversion rate | CVR | Purchase events / total view events, computed at the user level | [0.01, 0.99] |
| Category breadth | N_cat | Number of distinct top-level product categories from which the user purchased (max. 8) | [1, 8] |
| Avg. order value | AOV | Monetary / Frequency — mean spend per purchase session | [8, 600] |
| Brand diversity | B_div | Count of distinct brands purchased across all sessions | [1, 20] |
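
The operational definitions in the table above can be sketched as a single aggregation pass over an event log. This is an illustration under stated assumptions, not the authors' pipeline: the event field names (`user_id`, `event_type`, `day`, `session`, `price`, `category`, `brand`) and the helper `ev` are hypothetical, the `+ 1` in the recency calculation is an assumption made to match the inclusive [1, 61] range, and returning `None` for CVR when a buyer has no recorded views is an illustrative choice.

```python
from collections import defaultdict
from datetime import date

REFERENCE_DATE = date(2019, 11, 30)  # end of the observation window


def rfmb_features(events, reference_date=REFERENCE_DATE):
    """Aggregate event dicts into one RFM-B feature dict per buyer.

    Each event is assumed (hypothetically) to carry: user_id, event_type
    ("view" or "purchase"), day (datetime.date), session, price,
    category (top-level) and brand.
    """
    views = defaultdict(int)
    purchases = defaultdict(list)
    for e in events:
        if e["event_type"] == "view":
            views[e["user_id"]] += 1
        elif e["event_type"] == "purchase":
            purchases[e["user_id"]].append(e)

    features = {}
    for user, bought in purchases.items():
        frequency = len({e["session"] for e in bought})  # distinct purchase sessions
        monetary = sum(e["price"] for e in bought)
        last_day = max(e["day"] for e in bought)
        features[user] = {
            # + 1 makes the day count inclusive (assumption, see lead-in).
            "R": (reference_date - last_day).days + 1,
            "F": frequency,
            "M": monetary,
            # Undefined without views; None is an illustrative choice.
            "CVR": len(bought) / views[user] if views[user] else None,
            "N_cat": len({e["category"] for e in bought}),
            "AOV": monetary / frequency,
            "B_div": len({e["brand"] for e in bought}),
        }
    return features


def ev(event_type, day, session, price, category, brand, user_id=1):
    """Build one hypothetical event record."""
    return {"user_id": user_id, "event_type": event_type, "day": day,
            "session": session, "price": price, "category": category,
            "brand": brand}


events = [
    ev("view", date(2019, 11, 20), "a", 20.0, "electronics", "x"),
    ev("view", date(2019, 11, 20), "a", 35.0, "electronics", "z"),
    ev("purchase", date(2019, 11, 20), "a", 20.0, "electronics", "x"),
    ev("view", date(2019, 11, 28), "b", 40.0, "appliances", "y"),
    ev("view", date(2019, 11, 28), "b", 40.0, "appliances", "y"),
    ev("purchase", date(2019, 11, 28), "b", 40.0, "appliances", "y"),
]
features = rfmb_features(events)
print(features[1])
```

Only users with at least one purchase receive a feature row, which mirrors the restriction of the analytical sample to buyers.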

| Segment | N (%) | Recency (days) | Frequency | Monetary (USD) | CVR | Category breadth | AOV (USD) | Brand diversity |
|---|---|---|---|---|---|---|---|---|
| Champions | 7,067 (11.0%) | 3.2 | 21.6 | 910.0 | 0.57 | 4.91 | 45.8 | 4.1 |
| Loyal Customers | 14,092 (21.9%) | 10.8 | 10.8 | 440.2 | 0.39 | 3.50 | 45.2 | 2.7 |
| Potential Loyalists | 3,577 (5.6%) | 23.6 | 1.9 | 258.3 | 0.23 | 2.30 | 147.2 ★ | 1.8 |
| At-Risk | 25,490 (39.7%) | 24.0 | 4.3 | 154.6 | 0.20 | 1.97 | 39.2 | 1.6 |
| Lost | 13,978 (21.8%) | 59.4 | 1.3 | 46.3 | 0.07 | 1.50 | 38.8 | 1.2 |
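
The segment sizes and monetary values in the table above can be combined into buyer shares and implied spend shares. The sketch below is plain arithmetic on the published figures. It assumes the Monetary column is the per-segment arithmetic mean, so that buyers multiplied by mean spend approximates segment revenue; the paper's table does not state this explicitly, and the function name is illustrative.

```python
# (segment, buyers, monetary value in USD) copied from the segment profile table.
SEGMENTS = [
    ("Champions", 7_067, 910.0),
    ("Loyal Customers", 14_092, 440.2),
    ("Potential Loyalists", 3_577, 258.3),
    ("At-Risk", 25_490, 154.6),
    ("Lost", 13_978, 46.3),
]


def segment_shares(segments):
    """Return {segment: (buyer share, implied revenue share)}."""
    total_buyers = sum(n for _, n, _ in segments)
    # Implied revenue = buyers x mean spend; exact only if the monetary
    # column is the per-segment arithmetic mean (assumption).
    revenue = {name: n * mean for name, n, mean in segments}
    total_revenue = sum(revenue.values())
    return {
        name: (n / total_buyers, revenue[name] / total_revenue)
        for name, n, _ in segments
    }


shares = segment_shares(SEGMENTS)
for name, (buyer_share, revenue_share) in shares.items():
    print(f"{name:20s} buyers {buyer_share:6.1%}   revenue {revenue_share:6.1%}")
```

The segment sizes sum to the 64,204 buyers in the analytical sample. Under the stated assumption, Champions and Loyal Customers together make up about a third of buyers but account for roughly 70% of implied spend.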

| Segment | Precision | Recall | F1-score | Test support |
|---|---|---|---|---|
| Champions | 0.98 | 0.96 | 0.97 | 1,413 |
| Loyal Customers | 0.96 | 0.94 | 0.95 | 2,819 |
| Potential Loyalists | 0.98 | 0.98 | 0.98 | 715 |
| At-Risk | 0.97 | 0.98 | 0.97 | 5,098 |
| Lost | 0.98 | 0.98 | 0.98 | 2,796 |
| Overall test accuracy | — | — | 0.970 | 12,841 |
| 5-fold CV (mean ± σ) | — | — | 0.968 ± 0.001 | 64,204 |
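
The per-segment figures in the table above can be checked for internal consistency, since F1 is the harmonic mean of precision and recall. The sketch below recomputes F1 from the published precision and recall and derives macro and support-weighted averages. It uses only the table's rounded values, so small discrepancies are expected, and the function names are illustrative.

```python
# (segment, precision, recall, reported F1, test support) from the table.
REPORT = [
    ("Champions", 0.98, 0.96, 0.97, 1_413),
    ("Loyal Customers", 0.96, 0.94, 0.95, 2_819),
    ("Potential Loyalists", 0.98, 0.98, 0.98, 715),
    ("At-Risk", 0.97, 0.98, 0.97, 5_098),
    ("Lost", 0.98, 0.98, 0.98, 2_796),
]


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_f1(rows):
    """Unweighted mean of the reported per-segment F1 scores."""
    return sum(row[3] for row in rows) / len(rows)


def weighted_f1(rows):
    """Mean of the reported F1 scores weighted by test support."""
    total = sum(row[4] for row in rows)
    return sum(row[3] * row[4] for row in rows) / total


for name, p, r, reported, _ in REPORT:
    print(f"{name:20s} recomputed F1 {f1(p, r):.3f}   reported {reported:.2f}")
print(f"macro F1    = {macro_f1(REPORT):.3f}")
print(f"weighted F1 = {weighted_f1(REPORT):.3f}")
print(f"test support = {sum(row[4] for row in REPORT):,}")
```

The test support sums to 12,841, which is about 20% of the 64,204 buyers and is therefore consistent with an 80/20 hold-out split.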

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).