Submitted: 25 July 2024
Posted: 25 July 2024
Abstract
Keywords:
1. Introduction
2. Theoretical Background
2.1. Digitization, Digitalization and Digital Transformation
2.2. Data Monetization
2.3. Data, Data Structuring, and Data Analytics
2.4. Customer Analytics
2.5. Algorithms for Data Integration & Cleansing
3. Research Methodology
4. Results—Insights Into the Data Cleansing Initiative
4.1. Three Main Phases to Utilize Data for Customer Analytics
4.2. Data Cleanliness, Integration and Harmonization as a Key Challenge
4.3. Challenges Identification—Impeding Progress
4.4. Proposed Solution for Overcoming These Challenges
4.5. A Framework for a Customer Master Data Cleansing Process
5. Conclusions
| 1 | The terms "data cleaning" and "data cleansing" are often used interchangeably in data management and analytics, although a distinction is sometimes drawn depending on context or industry practice. Data cleaning generally refers to identifying and rectifying errors or inconsistencies in data to improve its quality, for example by correcting typographical errors, handling missing values, and removing duplicate records; it typically addresses immediate, apparent issues. Data cleansing is more comprehensive: it aims to make the data not only free from errors but also accurate, consistent, and usable for its intended purpose, which may include verifying data against external sources, ensuring data integrity, and standardizing formats across datasets. Because this broader sense is the one intended here, we use the term "data cleansing" in this paper. |
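The distinction drawn in this footnote can be illustrated with a minimal Python sketch. It is not the procedure used in the paper: the record fields (`name`, `country`), the `VALID_COUNTRIES` reference list, and both function names are hypothetical and chosen only for illustration. `clean` performs the surface-level tasks (whitespace, missing values, duplicates), while `cleanse` additionally standardizes formats and checks values against a reference source.

```python
# Hypothetical reference list standing in for an external source of truth.
VALID_COUNTRIES = {"DE", "FR", "US"}


def clean(records: list[dict]) -> list[dict]:
    """Surface-level cleaning: fix whitespace, fill missing values, drop duplicates."""
    seen = set()
    result = []
    for rec in records:
        # Collapse repeated whitespace and trim; treat None as an empty string.
        name = " ".join((rec.get("name") or "").split())
        country = (rec.get("country") or "").strip() or "UNKNOWN"
        key = (name.lower(), country.lower())
        # Skip records without a name and case-insensitive duplicates.
        if name and key not in seen:
            seen.add(key)
            result.append({"name": name, "country": country})
    return result


def cleanse(records: list[dict]) -> list[dict]:
    """Broader cleansing: clean first, then standardize and verify against a reference."""
    result = []
    for rec in clean(records):
        code = rec["country"].upper()  # standardize the country code format
        result.append({
            "name": rec["name"],
            "country": code,
            "country_valid": code in VALID_COUNTRIES,  # verification step
        })
    return result


raw = [
    {"name": "  acme   gmbh ", "country": "de"},
    {"name": "ACME GmbH", "country": "DE"},
    {"name": "Globex", "country": None},
]
cleansed = cleanse(raw)
```

In this sketch the first two raw records collapse into one, and the record without a country is kept but flagged as unverified rather than silently dropped, which leaves the decision about such records to a later review step.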


| Key Research Theme | Summary |
|---|---|
| Digitization, Digitalization, and Digital Transformation | Research explores the concepts of digitization, digitalization, and digital transformation. Digitization involves converting analog signals into digital ones, separating data from its medium. Digitalization involves integrating digital technologies into organizational processes to create new value opportunities. Digital transformation immerses the entire enterprise in digital methods, extending beyond processes and data to impact operations, business models, and competencies. |
| Data Monetization | Data monetization is integral to discussions on digitization, digitalization, and digital transformation. It involves deriving economic benefits from available data sources through direct or indirect methods. Direct methods include data sales, licensing, and participation in data marketplaces. Indirect methods involve utilizing data to optimize internal operations, refine products, or enhance services. |
| Data Structuring and Data Analytics | Data cleaning, preparation, and harmonization are crucial for effective data monetization. The process of big data analytics comprises four phases: turning data into insights, transforming insights into decisions, translating decisions into actions, and generating data points for future decision-making. Companies should progress through four stages of maturity in developing data analytics capabilities: data structuring, data availability, basic analytics, and advanced analytics. |
| Customer Analytics | Data cleaning is essential for customer analytics, which enables businesses to make data-driven decisions and enhance customer engagement, marketing effectiveness, and profitability. Customer analytics involves examining and interpreting customer data to understand and predict behavior, preferences, and trends. It empowers sales representatives and managers to increase revenue and improve customer satisfaction and loyalty. |
| Algorithms for Data Integration & Cleaning | Various algorithms facilitate data integration and cleaning, ensuring data quality, consistency, and reliability for analytics and decision-making. These algorithms include entity resolution, schema matching, data fusion, ontology-based integration, missing value imputation, outlier detection and removal, normalization and standardization, text cleaning, and data transformation. Advanced techniques such as deep learning, reinforcement learning, and graph-based algorithms enhance data cleaning effectiveness. |
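Three of the algorithm families named in the last table row (entity resolution, missing value imputation, and outlier detection) can be sketched with the Python standard library alone. This is a simplified illustration under stated assumptions, not the implementation discussed in the paper: the greedy string-similarity clustering, the 0.85 threshold, the median fill, the 1.5 × IQR rule, and the sample data are all illustrative choices.

```python
from difflib import SequenceMatcher
from statistics import median, quantiles


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def resolve_entities(names: list[str], threshold: float = 0.85) -> list[list[str]]:
    """Greedy entity resolution: attach each name to the first cluster
    whose representative (its first member) is similar enough."""
    clusters: list[list[str]] = []
    for name in names:
        for cluster in clusters:
            if similarity(name, cluster[0]) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])  # no match found: start a new cluster
    return clusters


def impute_median(values: list) -> list:
    """Replace missing values (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values: list, k: float = 1.5) -> list:
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical customer master data.
clusters = resolve_entities(["Acme GmbH", "ACME Gmbh", "Acme GmbH.", "Globex Corp"])
revenues = impute_median([100.0, None, 120.0, 110.0])
suspicious = iqr_outliers([100, 110, 120, 110, 105, 115, 900])
```

Here the three spellings of the same company end up in one cluster, the missing revenue is filled with the median of the observed values, and the value 900 is flagged for review. Real customer master data would usually call for blocking or indexing before pairwise comparison, since comparing every name against every cluster scales poorly.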
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
