Introduction
In the evolving discourse on cultural heritage preservation and dissemination, museum cultural creative products (MCCPs) have emerged as vital instruments that bridge traditional heritage with contemporary audiences. Rooted in historical artifacts, symbols, and narratives, MCCPs serve not merely as commodities but as recontextualized expressions of heritage values. Understanding how consumers perceive and engage with these products offers critical insight into the effectiveness of heritage communication, cultural identity reinforcement, and the societal resonance of historical narratives. However, in recent years, cultural heritage management institutions worldwide have faced increasing financial pressures (Borin & Donato, 2023; Stamatoudi & Roussos, 2024; Volanakis et al., 2024). With reduced government funding for cultural preservation projects, many museums and cultural institutions have had to turn to market-oriented approaches to ensure their long-term operational sustainability (Aroles & Morrell, 2024; Porter, 2024). Traditional management models of cultural heritage primarily rely on government grants, donations, and ticket revenues; however, the instability of these funding sources makes it challenging to support the long-term maintenance and innovative development of cultural heritage (Acampa & Parisi, 2021; Barile & Saviano, 2015). Therefore, establishing a sustainable business model for cultural heritage is essential not only for preserving history but also for ensuring that cultural institutions can continue to thrive and remain viable. In recent years, the cultural and creative industries have been regarded as a key pathway to addressing the sustainable development of cultural heritage (Gao et al., 2023; Imperiale et al., 2021). The cultural and creative industries not only transform cultural resources into marketable products but also enhance public interest and recognition of cultural heritage through innovative forms (Howkins, 2018). 
Among these, MCCPs, as an essential component of the cultural and creative industries, have become a vital means for major museums worldwide to enhance cultural dissemination and expand revenue streams (Haoyuan Cheng et al., 2023; Gao & Xinquan, 2024; Li et al., 2021). For example, world-renowned museums such as the Palace Museum, the British Museum, and the Louvre have successfully commercialized and socialized cultural content through the development of high-quality MCCPs (Chen et al., 2020; Li & Li, 2022; Tsang et al., 2022; Tu et al., 2019). By blending traditional culture with contemporary design, MCCPs bridge the gap between the past and present, foster cultural engagement, meet the evolving demands of today’s consumers (Hui Cheng, Xu Sun, et al., 2023; Cheng et al., 2024), enhance the economic value of cultural heritage, and facilitate its reinterpretation and global dissemination (Shangguan, 2024).
However, despite the rapid development of the MCCP industry, its market growth faces multiple challenges from the consumer perspective. First, the issue of product homogenization has become increasingly prominent (Cheng et al., 2024; Li et al., 2021; Tu et al., 2019). Some MCCPs overly rely on simple reproductions of traditional cultural symbols, lacking creative innovation, which diminishes consumer interest and weakens their market appeal (Carvajal Pérez et al., 2020; L. Liu, 2024). Consumers have also voiced numerous complaints about the homogenization of MCCPs, which not only reduces their purchase intentions and satisfaction but also limits the further development of the MCCP market (H. Li et al., 2024; Xu & Chen, 2024). Therefore, enhancing the uniqueness and attractiveness of products through innovative design while preserving cultural value has become a critical issue for the MCCP industry. Second, when evaluating MCCPs, consumers consider not only the cultural value of the products but also multiple factors such as aesthetics, functionality, and consumption experience (Cheng et al., 2024). However, research on consumer perception in a broader context has yielded fragmented and often contradictory findings (Blut et al., 2023), let alone studies specifically focusing on consumers of cultural creative products in museums.
Additionally, consumers’ purchasing decisions are influenced not only by the products themselves but also by the overall experience of the purchasing process, including brand interaction, service quality, and price perception (Kalyva et al., 2024; Levrini & Jeffman dos Santos, 2021; Pina & Dias, 2021; Wang et al., 2021). Optimizing the consumption experience of MCCPs to enhance consumer satisfaction remains a significant challenge for the cultural and creative industries. These issues suggest that the market success of MCCPs depends not only on the transmission of cultural content but also on multiple factors, including product design and the overall consumption experience. Therefore, in-depth research on how consumers evaluate MCCPs and the impact of these evaluations on satisfaction is crucial for promoting the sustainable development of the MCCP market.
Although the existing studies mentioned above have explored the development of the MCCP industry, several research gaps remain. First, there is a lack of systematic research on the consumer evaluation mechanism of MCCPs. Existing studies primarily focus on the cultural value and market promotion of MCCPs while paying less attention to the multiple factors that consumers consider when evaluating MCCPs. Second, traditional consumer behavior research emphasizes functionality, price, and brand influence; however, the impact of creativity, aesthetic experience, and cultural attributes on consumer satisfaction in the context of MCCPs as cultural consumer goods has not been fully explored. Furthermore, most current MCCP research relies on questionnaire surveys, failing to fully utilize consumer-generated online review data to uncover consumers’ actual needs and preferences.
To address these research gaps, this study aims to develop a data-driven MCCP evaluation framework that identifies the key factors influencing consumer satisfaction by systematically analyzing consumers’ multidimensional evaluations of MCCPs. At the same time, this study will explore the mechanism through which product perception influences consumption experience and satisfaction. Consumers’ perceptions of MCCPs encompass not only cultural value but also multiple dimensions such as aesthetics, functionality, and creativity. How these perceptions affect final satisfaction through the mediating role of consumption experience is a key focus of this study, and the findings are intended to provide a theoretical basis for optimizing MCCP design and marketing strategies. Additionally, this study will apply the Diffusion of Innovation Theory to explain how consumers adopt MCCPs and explore the applicability of this theory in the context of MCCPs. To achieve these objectives, this study uses consumer-generated comments from an e-commerce platform, employing text mining, sentiment analysis, and partial least squares structural equation modeling (PLS-SEM) to extract MCCP evaluation dimensions in a data-driven manner and validate their impact on consumer satisfaction.
The structure of this study is as follows: Section 2 provides a literature review, revisiting MCCP consumer behavior research, the Diffusion of Innovation Theory, and related consumer satisfaction theories to provide theoretical support for the research hypotheses. Section 3 outlines the research methodology, introducing data sources, text mining methods, sentiment analysis, and the construction of the PLS-SEM model. Section 4 presents the research results, reporting the core dimensions of consumer evaluations and their impact on consumption experience and consumer satisfaction. Section 5 discusses the findings and contributions, exploring the MCCP evaluation mechanism, deepening the Diffusion of Innovation Theory, and proposing management and policy recommendations. Section 6 concludes the study, summarizing its contributions, pointing out limitations, and suggesting directions for future research.
Literature Review
Consumer behavior is a multi-faceted construct shaped by cognitive, emotional, and situational factors. Among these, three critical dimensions (product value perception, consumption experience, and consumer satisfaction) play a fundamental role in influencing purchasing decisions and post-consumption evaluations. Product value perception refers to how consumers assess a product’s worth, taking into account both its functional and emotional attributes. Consumption experience encompasses the sensory, cognitive, and affective aspects of engaging with a product or service. Consumer satisfaction reflects whether expectations are met, exceeded, or disappointed, and in turn influences future behavior. Understanding the interplay among these factors is essential for advancing MCCP consumer behavior research and cultural heritage marketing strategies. The following sections critically examine these dimensions, drawing on theoretical and empirical insights.
Consumer Satisfaction
Consumer satisfaction has long been recognized as a crucial factor in influencing consumer behavior, brand loyalty, and repeat purchases. Traditionally, scholars argue that consumer satisfaction is primarily driven by the perception of product attributes and the quality of the consumption experience (Oliver, 1980). In the context of heritage tourism and MCCPs, these factors, together with psychological and emotional engagement, play a crucial role in determining whether visitors feel a sense of fulfillment and connection to their purchases. Prior research finds that museum visitors often evaluate MCCPs based on their authenticity, quality, uniqueness, and relevance to the heritage experience (McKercher & du Cros, 2002; Pine & Gilmore, 2011). Perceived authenticity is particularly crucial, as it significantly impacts visitor satisfaction in heritage tourism (Chhabra et al., 2003). However, authenticity is often a subjective perception rather than an objective reality, meaning that visitors’ expectations can shape their satisfaction regardless of the product’s actual historical accuracy (Cohen, 1988). Put differently, if consumers perceive a product as successfully reflecting the cultural and historical significance of a site, they are more likely to find it valuable and, consequently, feel satisfied with their purchase.
Additionally, the shopping or consumption experience itself—such as engaging with knowledgeable museum staff, encountering interactive displays, or purchasing well-packaged products—can enhance overall satisfaction (Holbrook & Hirschman, 1982; Parasuraman et al., 1988). According to the expectation-disconfirmation model (Anderson & Sullivan, 1993; Oliver, 2014), consumers derive satisfaction when their experience meets or exceeds their pre-existing expectations. Empirical studies in heritage tourism confirm this relationship, demonstrating that visitor satisfaction is more dependent on the alignment of expectations than on the objective quality of the product (Wen et al., 2024). In the museum context, a well-crafted and meaningful MCCP, combined with a positive purchasing experience, is likely to result in high consumer satisfaction. In summary, research strongly supports the notion that product perception and the consumption experience are key determinants of consumer satisfaction.
However, an opposing viewpoint argues that satisfaction does not always stem directly from product perception or experience alone. Consumer personality traits also play a crucial role, as individuals with higher emotional sensitivity may report satisfaction differently than those with more analytical tendencies (Ladhari, 2007). Additionally, emotional memory can distort post-experience evaluations, meaning that a consumer’s mood or sentimental connection to a purchase may override their actual experience (Hosany & Gilbert, 2009). According to Cognitive Dissonance Theory (Festinger, 1957), consumers often experience psychological discomfort if they feel uncertain about their purchase decision. In such cases, they may rationalize their choice by convincing themselves that a product is more valuable than it is, leading to an inflated sense of satisfaction. For example, a museum visitor who buys an expensive yet low-quality replica artifact may justify the purchase by emphasizing its cultural significance despite the product itself failing to meet expectations.
Additionally, external factors can override perceived product quality and consumption experience, leading to satisfaction or dissatisfaction regardless of a product’s actual attributes. Price sensitivity plays a significant role in consumer evaluations, as visitors may feel dissatisfied even with a high-quality product if they perceive it as overpriced (Thaler, 1985; Zeithaml, 1988). Similarly, social influence can shape consumer satisfaction; tourists may purchase a heritage product due to peer pressure or because it is endorsed by experts rather than because they genuinely appreciate it (Solomon, 2020). In such cases, satisfaction is not necessarily linked to product perception but rather to external validation and expectations.
Another challenge to the traditional product perception-experience model comes from the debate between experience and outcome. Some scholars suggest that consumers remember the outcome of their consumption experience rather than the process itself (Kahneman, 2011; Mittal & Kamakura, 2001). In the context of MCCPs, a visitor might have a frustrating shopping experience but later develop substantial sentimental value toward their purchased item, leading to long-term satisfaction. This perspective suggests that satisfaction can be delayed and re-evaluated over time rather than being an immediate response to product attributes or the shopping experience.
Furthermore, behavioral economics and consumer psychology offer alternative explanations for satisfaction that challenge purely rational models of consumer behavior. Kahneman and Tversky (1979) and Schwarz and Clore (1983) argue that consumer satisfaction is often driven by emotional biases rather than logical assessments of product quality. The dual-system theory of Kahneman (2011) further supports this, suggesting that consumer evaluations rely on a mix of fast, emotional intuition (System 1) and slower, analytical reasoning (System 2), with the former often dominating satisfaction judgments. For instance, in a museum scenario, a positive emotional interaction with museum staff may overshadow the actual quality of a product, making a consumer feel satisfied even if the product itself is unremarkable. Conversely, a visitor might leave dissatisfied despite a great product and experience if they feel the product was too expensive or if they were in a bad mood during the purchase.
Consumption Experience
Consumption experience is a fundamental aspect of consumer behavior that encompasses the emotional, sensory, cognitive, and social dimensions of engaging with a product or service. Unlike the traditional utilitarian perspective, which views consumption primarily as a means of fulfilling functional needs, contemporary theories emphasize the experiential nature of consumption, where individuals seek pleasure, meaning, and engagement beyond mere utility. The shift from a rationalist framework to an experiential perspective has been primarily influenced by the work of Holbrook (1999), who introduced the experiential view of consumption, arguing that consumers derive value not only from product attributes but also from the subjective enjoyment associated with the consumption process. This perspective aligns with later research on experiential consumption, which suggests that consumers derive hedonic value from sensory and emotional stimuli associated with consumption (Arnould & Price, 1993; Gentile et al., 2007). Contributions from scholars across various disciplines have marked the evolution of consumption experience as an academic field. Pine and Gilmore (2011) developed the Experience Economy framework, which posited that businesses should focus on designing memorable experiences rather than simply offering goods or services. Carù and Cova (2003) examined immersion in consumption experiences, arguing that deep engagement with a product or service fosters a stronger emotional and psychological connection. This argument aligns with research on emotional attachment in consumption experiences, which posits that consumers form strong affective bonds with brands, products, and services when these align with personal identity and self-concept (Debenedetti & Chaney, 2024; Thomson et al., 2005; J. Zhang et al., 2024).
In the context of heritage tourism, the consumption experience takes on a distinct form, as tourists engage with cultural and historical sites that evoke a sense of authenticity, nostalgia, and emotional connection. Unlike conventional tourism, which often prioritizes leisure and entertainment, heritage tourism involves an interpretive and educational dimension, where visitors seek to establish a deeper connection with historical narratives and cultural identity. McKercher and du Cros (2002), Lin et al. (2024), and Jv et al. (2024) emphasized the role of cultural authenticity in shaping visitor experiences, arguing that tourists’ perceptions of a site’s historical integrity influence their overall satisfaction. In addition, Rasoolimanesh and Lu (2024) and Steriopoulos et al. (2024) argued that heritage tourists are often motivated by a personal or emotional connection to a site, which enhances their overall consumption experience. Recent research on nostalgia in tourism experiences further supports this perspective, demonstrating that visitors who experience nostalgia during heritage tourism activities exhibit higher satisfaction and stronger emotional attachment to the site (Cho et al., 2019). These findings suggest that heritage tourism experiences are not merely about observing historical artifacts but involve active participation in meaning-making and emotional engagement.
Within the broader framework of heritage tourism, MCCPs allow visitors to engage with cultural narratives beyond the confines of the museum space. A recent study highlights that MCCPs play a significant role in enhancing consumers' cultural interest and engagement, particularly when they effectively represent historical and artistic elements (Ye & Yu, 2024). Thus, the consumption experience of MCCPs is shaped by several interrelated factors, including perceived authenticity, aesthetic appeal, emotional connection, and multi-sensory engagement (Huang et al., 2023). However, as observed by Rickly et al. (2023), Huang et al. (2023), and Jin and Hwang (2024), authenticity is often socially constructed rather than objectively determined, meaning that visitors’ expectations and pre-existing cultural knowledge influence their interpretation of MCCPs. For example, an MCCP that closely resembles an ancient artifact may be perceived as authentic and meaningful by some consumers. In contrast, others may view it as a mere commercial reproduction with little intrinsic value. Beyond authenticity, aesthetic and design attributes significantly impact the consumption experience of MCCPs. Contemporary research highlights that the success of MCCP design depends on the creative integration of traditional cultural elements, modern technologies, fashion trends, and aesthetic preferences (Capece et al., 2024; Chen et al., 2024; Furferi et al., 2024; H. Li et al., 2024; Tussyadiah et al., 2018; Zheng et al., 2024).
However, recent studies have questioned the universal applicability of experiential advantage. For instance, Weingarten and Goodman (2021) conducted a meta-analysis revealing that the happiness derived from experiential purchases compared to material ones is moderated by various factors, suggesting that the experiential advantage may not be as robust as previously thought. Furthermore, they challenged the clear-cut distinction between experiential and material purchases, arguing that the interplay between these types of consumption is more complex and context-dependent than earlier theories have proposed (Weingarten et al., 2023). These critiques underscore the need for a more nuanced understanding of consumption experiences, taking into account the situational and individual differences that influence consumer preferences.
Based on the aforementioned review, we put forward the hypothesis that Consumption Experience significantly impacts Consumer Satisfaction.
Product Value Perception
Understanding how consumers perceive a product’s value is crucial in determining their engagement and overall satisfaction. Consumer Value Perception Theory (CVPT) offers a framework for examining the various dimensions through which consumers assess products and services (Sánchez-Fernández & Iniesta-Bonillo, 2007). Early studies, particularly those of Holbrook (1999), highlight multiple dimensions of value, including efficiency, excellence, aesthetics, and play, which contribute to how consumers perceive and interact with products. Zeithaml (1988) further emphasized that perceived value is not just a function of price but also a balance between quality, emotional engagement, and functional utility.
Recent literature has expanded the concept of product perception to reflect its impact on both consumption experience and consumer satisfaction (Morar, 2013; Tasci, 2016). Within the context of heritage tourism and MCCPs, consumer perceptions are shaped by factors such as design, authenticity, packaging, and functionality, all of which influence both the experience of products and their contribution to satisfaction (Cheng et al., 2024; Guo et al., 2023). To explore this further, this section examines two major pathways: (1) how product perception influences consumption experience and (2) how product perception directly impacts consumer satisfaction.
The Impact of Product Perception on Consumption Experience
Consumer perception of product attributes (such as aesthetics, creativity, functionality, design, cultural authenticity, symbolic meaning, and emotional appeal) not only shapes consumers’ evaluation of a product but also significantly influences the shopping experience. In the context of MCCPs, these attributes impact consumer engagement, interpretation, and experience during the purchasing process, as well as their interactions with museum retail environments, product displays, and brand storytelling (Guo et al., 2023; Huang et al., 2021; Li & Li, 2022; Li et al., 2021). Existing research suggests that shopping is not merely a transactional activity but an experiential journey influenced by sensory stimulation, emotions, and contextual factors (Pine & Gilmore, 2011). A well-perceived product can enhance this journey, making the shopping experience more engaging, immersive, and emotionally rewarding (Schmitt, 1999). Consumers who are drawn to the artistic and historical significance of an MCCP may perceive their shopping experience as more meaningful because it aligns with their cultural identity and personal values (Tu et al., 2019). This view is consistent with self-congruence theory, which posits that consumers derive greater satisfaction from shopping experiences that reinforce their self-concept (Sirgy, 1982). Beyond psychological engagement, product perception also influences behavioral aspects of the shopping experience. Consumers who perceive an MCCP as authentic and aesthetically refined are more likely to engage with product demonstrations, interactive store displays, and knowledgeable staff, all of which enhance shopping enjoyment and memorability (Guo et al., 2023).
Additionally, packaging and storytelling elements within a retail environment shape consumer mood, engagement levels, and willingness to explore products (Duarte et al., 2024; Yuxin et al., 2024). However, some scholars argue that product perception alone may not always determine the quality of the shopping experience. Weingarten et al. (2023) highlight that external variables, such as store atmosphere, customer service, and price positioning, can override product attributes in shaping consumers' emotional states. Additionally, some studies suggest that consumers may rationalize their shopping experiences post-purchase, meaning that their satisfaction with the shopping trip may not directly correlate with their initial product perception (Lin et al., 2022; Park et al., 2015). Despite these counterarguments, existing literature predominantly supports the idea that product perception—through aesthetics, authenticity, symbolic meaning, and cultural resonance—significantly influences the shopping experience by shaping emotional engagement, cognitive processing, and interaction levels. Given the strong theoretical and empirical support for the role of product perception in shaping how consumers engage with and experience products, this study proposes the following hypothesis: Product Perception significantly impacts Consumption Experience.
Product Perception in Shaping Consumer Satisfaction
Consumer satisfaction with MCCPs is influenced by multiple dimensions of product value perception, including creativity, aesthetics, functionality, and cultural authenticity (Baghirov & Zhang, 2024; Huang et al., 2023; Kumar et al., 2025; Lee et al., 2024; Tu et al., 2019). Consumers often favor MCCPs that incorporate innovation, as creativity enhances uniqueness and fosters emotional attachment (Hui Cheng, Xu Sun, et al., 2023; Cheng et al., 2024). Similarly, aesthetic appeal plays a crucial role in satisfaction, with consumers gravitating toward products that exhibit refined craftsmanship and artistic excellence (Rejón-Guardia, 2024; Sonderegger et al., 2014). In addition to design attributes, functionality is increasingly valued as consumers appreciate MCCPs that offer practical applications in conjunction with cultural significance (H. Li et al., 2024). Cultural authenticity further strengthens consumer satisfaction by reinforcing a sense of historical connection and meaning as individuals seek products that align with their perceptions of genuine heritage (Jin & Hwang, 2024; Kreuzbauer & Keller, 2017). While these factors are widely acknowledged as key drivers of consumer satisfaction, some scholars argue that satisfaction does not always stem directly from product attributes alone. Research suggests that external influences, such as social expectations, branding, and psychological biases, may shape consumer perceptions more than inherent product qualities (Ordabayeva et al., 2022; Włodarska et al., 2019). For instance, consumers may rationalize dissatisfaction post-purchase, convincing themselves that a product is of higher value due to brand prestige or social desirability (Akdeniz et al., 2013; Sansome et al., 2024). 
Additionally, studies indicate that price sensitivity can override product value perception, leading to dissatisfaction when consumers perceive MCCPs as overpriced, even if they are high in creativity and authenticity (Cao & Wang, 2024; Kalyva et al., 2024). The cognitive dissonance effect further complicates this relationship, as individuals who regret their purchases may later attempt to justify them by focusing on select product attributes rather than forming an objective assessment of overall value (Jog et al., 2024; Xu & Jin, 2022). In summary, despite these counterarguments, research overwhelmingly supports the view that consumers who have a positive perception of MCCPs are more likely to experience higher levels of satisfaction. Given this relationship, it is reasonable to hypothesize that Product Perception significantly impacts Consumer Satisfaction.
Summary
Building on the previous discussion of the interrelationships among the dimensions, we summarize the proposed hypotheses in Table 1 and the conceptual framework in Figure 1.
Methodology
This study employs an LLM-assisted hybrid approach, integrating text mining, sentiment analysis, and Partial Least Squares Structural Equation Modeling (PLS-SEM) to analyze consumer perceptions of creativity in MCCPs. The methodology consists of three main phases: (1) data collection and preprocessing, (2) topic modeling and clustering using LDA + SVM and BERT + K-Means with LLM-enhanced semantic analysis, and (3) validation of extracted MCCP dimensions via PLS-SEM.
Data Collection
The Palace Museum is widely recognized as the benchmark for MCCP innovation in China, setting standards that many other museums follow. As one of China’s most significant cultural heritage sites, it holds a central position in the country’s historical and artistic legacy, making it an ideal case for studying consumer perceptions of MCCPs. Notably, the successful World Heritage inscription in 2024 of Beijing’s Central Axis, where the Palace Museum serves as the core landmark, further reinforced its cultural and symbolic importance. Given its unparalleled influence in heritage preservation and cultural commercialization, the Palace Museum provides a unique and representative context for analyzing how consumers engage with MCCPs in a market-driven heritage landscape. Furthermore, with China emerging as a global leader in digital commerce, an increasing number of museums have established official e-commerce platforms, such as Tmall flagship stores, to enhance accessibility to their MCCPs. These platforms, which synchronize product availability with physical museum stores and offer nationwide free shipping, have become a primary purchasing channel for MCCPs in China. Industry reports indicate that online transactions now constitute a substantial share of total MCCP sales, as virtual museum visits have surpassed physical attendance for the first time: Tmall and Taobao museum flagship stores alone recorded 1.6 billion visits in a single year, 1.5 times the total nationwide museum foot traffic (see footnote 1), underscoring the growing influence of e-commerce on consumer engagement with MCCPs.
This study does not rely on in-person museum visits to collect consumer feedback for two key reasons. First, direct interviews with consumers could introduce social desirability bias, where respondents may feel pressured to provide more favorable responses when interacting face-to-face. Second, on-site surveys present logistical challenges, requiring significant resources while yielding a relatively limited volume of textual data. In contrast, e-commerce platforms offer a vast and readily available repository of consumer-generated reviews, enabling large-scale text mining and quantitative analysis. Another advantage of leveraging e-commerce reviews over social media discussions is the authenticity of the feedback. Each review is tied to an actual purchase, ensuring that the dataset reflects real consumer experiences rather than speculative or promotional content.
Data Cleaning
To ensure that the dataset contains only meaningful consumer insights, a systematic data-cleaning process will be applied. This process will involve removing non-informative, redundant, or irrelevant reviews while retaining those that provide explicit evaluations of MCCP attributes. By implementing these cleaning criteria, the study will ensure that the final dataset is optimized for subsequent text mining and structural modeling. The detailed cleaning criteria and their impact on the dataset size will be reported in the Results section.
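As an illustration only, the kind of rule-based filtering described above could be sketched as follows. The specific rules (a minimum length, a placeholder-review list, exact-duplicate removal) and all names such as `MIN_CHARS` and `PLACEHOLDER_REVIEWS` are assumptions made for this sketch; the study's actual cleaning criteria are those reported in its Results section.

```python
import re

# Hypothetical thresholds and placeholder texts, chosen only for illustration.
MIN_CHARS = 5
PLACEHOLDER_REVIEWS = {"default positive review"}


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so near-identical reviews compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def clean_reviews(reviews: list[str]) -> list[str]:
    """Drop empty, placeholder, too-short, symbol-only, and duplicate reviews."""
    seen: set[str] = set()
    kept: list[str] = []
    for raw in reviews:
        text = normalize(raw)
        if not text or text in PLACEHOLDER_REVIEWS:
            continue  # non-informative: blank or platform-generated text
        if len(text) < MIN_CHARS:
            continue  # too short to evaluate any product attribute
        if not re.search(r"\w", text):
            continue  # punctuation or symbols only
        if text in seen:
            continue  # redundant: already kept an identical review
        seen.add(text)
        kept.append(raw.strip())
    return kept


sample = [
    "  ",
    "Good",
    "Beautiful design, feels like a palace artifact",
    "beautiful design,  feels like a palace artifact",
    "!!!!!!!",
    "default positive review",
    "The bookmark is sturdy and useful",
]
cleaned = clean_reviews(sample)
print(len(sample), "->", len(cleaned))
```

Counting reviews before and after each rule, as the final line hints, is one simple way to report the impact of cleaning on dataset size.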
Theme Identification
This study will employ a two-stage hybrid approach, integrating LDA+SVM and BERT+K-Means, to systematically extract consumer-perceived dimensions of MCCPs. When integrating results from LDA+SVM and BERT+K-Means, themes identified by both methods will be merged, while those detected in only one method will be retained for empirical validation in PLS-SEM. Finally, to ensure the accuracy of dimensions and indicators, a text analysis of the comments will be conducted in SPSSAU by identifying high-frequency keywords.
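The integration rule described above (merge themes found by both methods, keep single-method themes as candidates for PLS-SEM validation) can be sketched as below. How two themes are judged to be "the same" is not specified in the text; this sketch assumes keyword overlap measured by Jaccard similarity with an arbitrary threshold of 0.3, and the theme names and keywords are invented for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of keywords that two themes have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def integrate_themes(
    lda_svm: dict[str, set[str]],
    bert_kmeans: dict[str, set[str]],
    threshold: float = 0.3,  # assumed cut-off, not taken from the study
) -> tuple[dict[str, set[str]], dict[str, set[str]]]:
    """Return (merged themes, single-method candidate themes)."""
    merged: dict[str, set[str]] = {}
    candidates: dict[str, set[str]] = {}
    matched: set[str] = set()
    for name_a, kw_a in lda_svm.items():
        # Look for the most similar BERT+K-Means theme not yet matched.
        best_name, best_score = None, 0.0
        for name_b, kw_b in bert_kmeans.items():
            if name_b in matched:
                continue
            score = jaccard(kw_a, kw_b)
            if score > best_score:
                best_name, best_score = name_b, score
        if best_name is not None and best_score >= threshold:
            merged[name_a] = kw_a | bert_kmeans[best_name]
            matched.add(best_name)
        else:
            candidates[f"lda_svm:{name_a}"] = set(kw_a)
    for name_b, kw_b in bert_kmeans.items():
        if name_b not in matched:
            candidates[f"bert_kmeans:{name_b}"] = set(kw_b)
    return merged, candidates


lda_themes = {
    "aesthetics": {"beautiful", "design", "color"},
    "price": {"price", "expensive", "value"},
}
bert_themes = {
    "appearance": {"beautiful", "design", "exquisite"},
    "logistics": {"delivery", "packaging", "fast"},
}
merged, candidates = integrate_themes(lda_themes, bert_themes)
print(merged)
print(sorted(candidates))
```

Keeping the originating method in each candidate key makes it easy to trace, during PLS-SEM validation, which pipeline proposed a theme that the other did not.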
LDA and SVM
In the first stage, Latent Dirichlet Allocation (LDA), a probabilistic generative model commonly used for topic modeling, will be employed in DIKW (7.83) to identify latent topics within the consumer review corpus. LDA is particularly suitable for this study as it identifies underlying themes in large text datasets by analyzing word co-occurrence patterns, making it an effective tool for extracting consumer-perceived dimensions of MCCPs. To determine the optimal number of topics, topic coherence scores will be compared across candidate models, and the model with the highest coherence score will be selected. Additionally, LDAvis will be used to assess topic dispersion and overlap, further validating the separation and interpretability of themes. To determine the indicators for each theme, only keywords whose topic weights remain greater than zero after rounding to three decimal places will be retained; lower-weighted words will be treated as negligible in defining the thematic structure. This filtering process will ensure that only highly relevant keywords contribute to the final thematic framework. Using this approach, the study will identify dimensions (themes) along with their indicators (high-weight keywords), which will form the foundation for further classification.
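The two selection rules described above (highest coherence score, and keyword weights that stay above zero after rounding to three decimals) can be sketched in a few lines. This is a minimal illustration only: the coherence values and keyword weights are hypothetical placeholders, and the function names are not part of DIKW or any other tool used in the study.

```python
def select_num_topics(coherence_by_k):
    """Return the candidate topic count with the highest coherence score."""
    return max(coherence_by_k, key=coherence_by_k.get)


def filter_keywords(topic_weights, decimals=3):
    """Keep keywords whose weight is still above zero after rounding."""
    return {
        word: round(weight, decimals)
        for word, weight in topic_weights.items()
        if round(weight, decimals) > 0
    }


# Hypothetical coherence scores for three candidate models (not study results).
coherence_by_k = {3: 0.38, 4: 0.45, 5: 0.42}
best_k = select_num_topics(coherence_by_k)

# Hypothetical keyword weights for one topic.
weights = {"quality": 0.0412, "packaging": 0.0178, "ok": 0.0004}
kept = filter_keywords(weights)
```

Here the keyword with weight 0.0004 rounds to 0.000 and is dropped, while the other two are retained as candidate indicators.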
To validate the robustness of the topics extracted by LDA, a Support Vector Machine (SVM) classifier will be employed to refine the topic assignments. Since the themes derived from LDA are probabilistic rather than deterministic, the use of SVM will ensure that each review is assigned a single dominant theme. This approach will help verify the accuracy of the LDA results, confirming that the identified themes reliably represent the underlying patterns in the data. The SVM model will be trained using the LDA-derived themes as target labels, and its performance will be evaluated using standard classification metrics (accuracy, precision, recall, and F1-score), which will provide evidence that the LDA-derived themes are reliable and consistent. The dataset will be split into training (80%) and test (20%) sets, and a grid search with five-fold cross-validation will be conducted to identify the optimal SVM hyperparameters, including kernel type (linear, RBF, or polynomial), regularization parameter (C), and gamma value. This optimization process is intended to keep the classifier well-calibrated and to limit both overfitting and underfitting.
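For a multi-class problem such as theme labeling, precision, recall, and F1 must be averaged across classes. The sketch below assumes support-weighted averaging, which the text does not specify; the labels are hypothetical. One property worth noting is that support-weighted recall always coincides with overall accuracy.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = actual / n  # share of true instances of this class
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical theme labels for six held-out reviews (assumption).
y_true = ["design", "design", "price", "price", "culture", "culture"]
y_pred = ["design", "price", "price", "price", "culture", "design"]
metrics = weighted_metrics(y_true, y_pred)
```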
BERT and K-Means
In the second stage, Bidirectional Encoder Representations from Transformers (BERT), a deep learning-based natural language processing model, will be employed to refine and validate the creativity-related themes extracted from LDA+SVM. Unlike LDA, which identifies topics based on probabilistic word co-occurrence, BERT generates context-aware word embeddings that capture deeper semantic relationships between words and phrases. This ability makes BERT particularly suitable for analyzing consumer reviews, as it can identify subtle meanings in text that may not be evident in traditional topic modeling approaches. By integrating BERT embeddings with K-Means clustering, this study will enhance the robustness of theme identification by uncovering underlying patterns and improving the interpretability of creativity-related themes in MCCPs.
To generate BERT embeddings, each consumer review will first be tokenized using the tokenizer of a pre-trained BERT model (bert-base-chinese), which was trained on a large corpus of Chinese texts. The tokenizer segments Chinese text into subword tokens, ensuring that each token is mapped to its corresponding embedding in the pre-trained model, and the model then outputs high-dimensional vector representations encoding the semantic meaning of each review. Instead of reducing dimensionality through PCA or other techniques, this study will retain the full-dimensional BERT feature vectors and apply K-Means clustering directly to group semantically similar reviews into distinct themes. The decision to preserve the full-dimensional embeddings is based on the need to retain as much contextual information as possible, ensuring that fine-grained semantic relationships within consumer reviews are not lost.
Following the extraction of BERT embeddings, the study will employ K-Means clustering to group semantically similar reviews into distinct themes. K-Means is a widely used unsupervised machine learning algorithm that partitions data into clusters based on similarity, making it particularly effective for grouping consumer reviews with shared perceptions. To determine the optimal number of clusters (K), two widely used validation techniques will be applied. First, the Silhouette Score will be calculated for different values of K, measuring how well each data point fits within its assigned cluster while maintaining a clear separation from other clusters. A higher silhouette score will indicate that the clustering structure is well-defined. Second, the Elbow Method will be used to analyze the sum of squared errors (SSE) for different values of K, identifying the point at which adding more clusters no longer significantly reduces variance. This combination of methods will ensure that the most meaningful and well-separated clusters are selected for analysis. Once the optimal number of clusters (K) is identified, the best clustering model will be applied to categorize the reviews, generating the theme clusters that represent consumer-perceived dimensions of MCCPs.
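To make the silhouette criterion concrete, the sketch below computes the mean silhouette coefficient for two candidate labelings and picks the K with the higher score. It is a toy illustration under stated assumptions: the points are two-dimensional stand-ins for review embeddings, and the cluster assignments are given by hand rather than produced by K-Means.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(mean_dist(i, members)  # separation: nearest other cluster
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy 2-D points standing in for review embeddings (assumption).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Hand-made cluster assignments for two candidate values of K (assumption).
labelings = {
    2: [0, 0, 0, 1, 1, 1],
    3: [0, 0, 2, 1, 1, 1],
}
scores = {k: silhouette_score(points, labels) for k, labels in labelings.items()}
best_k = max(scores, key=scores.get)
```

Splitting the first group into two clusters lowers cohesion relative to separation, so the two-cluster labeling scores higher.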
To summarize and extract key themes from the results of BERT + K-Means clustering, DeepSeek R1 and ChatGPT o3-mini-high will be used. DeepSeek R1 excels in deep contextual understanding, enabling it to capture complex themes within clustered reviews accurately, while ChatGPT o3-mini-high is highly effective in generating concise, contextually relevant summaries from large, unstructured text. The keyword summaries produced by both models will be compared, and their outputs are expected to be broadly consistent. These keywords will then be used to supplement and validate the results of the LDA + SVM analysis, offering further insights into the thematic structure of the consumer feedback.
Text Analysis
To ensure both the adequacy and accuracy of the identified dimensions and their corresponding indicators, a final step of text analysis will be conducted by extracting high-frequency keywords from consumer reviews. This process aims to refine and supplement the dimensions and indicators derived from the LDA+SVM and BERT+K-Means approaches. By systematically identifying frequently occurring terms associated with consumer perceptions, additional relevant indicators will be incorporated to enhance the comprehensiveness of the analytical framework. This step not only ensures that all key aspects of MCCP evaluations are adequately captured but also reinforces the robustness of the categorization process, minimizing the risk of omitting critical consumer insights.
By integrating insights from probabilistic topic modeling (LDA), supervised classification (SVM), and deep learning-based clustering (BERT+K-Means), as well as two advanced LLMs, this study will establish a rigorous and data-driven thematic framework that accurately reflects consumer perceptions of MCCPs. The final themes derived from this hybrid methodology will provide the foundation for PLS-SEM validation, enabling further investigation into how MCCP dimensions influence consumer satisfaction.
Data Structurization
Before proceeding with PLS-SEM, it is necessary to transform the unstructured textual data into a structured form suitable for quantitative analysis. Since consumer reviews are qualitative in nature, sentiment analysis will be applied to derive numerical scores that represent consumer attitudes toward different MCCP dimensions. This transformation process consists of two key steps: sentiment classification and matrix transformation, ensuring that the final dataset is appropriately structured for PLS-SEM modeling.
Sentiment Classification
To classify sentiment within consumer reviews, this study will employ two independent sentiment analysis tools: the sentiment analysis module of SPSSAU (an automated statistical analysis platform widely used for academic research in China) and the sentiment analysis API of Alibaba Cloud (i.e., Aliyun, a leading cloud computing service provider in China, specializing in AI-driven data processing and e-commerce analytics). SPSSAU’s tool categorizes text into four sentiment levels (positive, slightly positive, slightly negative, and negative), providing a nuanced understanding of sentiment variation. In contrast, Alibaba Cloud’s sentiment analysis API, a machine learning-based tool trained on extensive consumer data from Taobao’s e-commerce platform, classifies sentiment into three broader categories: positive, neutral, and negative. The integration of these two tools will enable cross-validation, thereby improving the reliability of sentiment classification.
Since sentiment classification directly affects the validity of the final quantitative dataset, special attention must be given to the reliability of the sentiment results produced by the tools. Given the maturity and reliability of both tools, a direct reliability metric, such as Cohen’s Kappa or Krippendorff’s Alpha, was deemed unnecessary. However, where one tool labels a review as positive while the other categorizes it as negative, manual verification will still be needed to determine the correct sentiment classification. Thus, we introduce a manual judgment procedure in which an expert with more than 8 years of experience in the research field of MCCPs will resolve discrepancies between the two sets of results, ensuring that positive and negative elements are assigned to the appropriate MCCP dimensions.
Given that SPSSAU provides four sentiment levels while Alibaba Cloud offers only three, this study will adopt SPSSAU’s sentiment classification as the primary scoring system, with Alibaba Cloud’s classification serving as a validation tool. To standardize sentiment ratings, the following scoring system will be applied: (1) Positive = 7; (2) Slightly Positive = 5; (3) Neutral = 4; (4) Slightly Negative = 3; (5) Negative = 1. This scoring system is designed to enhance differentiation among sentiment levels, ensuring that subtle variations in consumer sentiment are effectively captured in the final dataset.
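The scoring rule and the cross-validation check can be sketched as follows. This is an illustration under stated assumptions rather than the tools' actual interfaces: the label strings are hypothetical, only a direct positive/negative contradiction is flagged for manual review, and the handling of Neutral labels from Alibaba Cloud is left out because the reconciliation rule is not spelled out here.

```python
# Score mapping for SPSSAU's four levels, as listed in the text.
SPSSAU_SCORES = {
    "positive": 7,
    "slightly positive": 5,
    "slightly negative": 3,
    "negative": 1,
}
NEUTRAL_SCORE = 4  # listed in the scoring system; not applied in this sketch


def score_review(spssau_label, aliyun_label):
    """Return (score, needs_manual_check) for one review.

    A review is flagged when one tool says positive and the other negative
    (assumption: only this exact contradiction counts as a conflict).
    """
    if {spssau_label, aliyun_label} == {"positive", "negative"}:
        return None, True  # left unscored until the expert decides
    return SPSSAU_SCORES[spssau_label], False


agreeing = score_review("positive", "positive")
conflicting = score_review("positive", "negative")
retained = score_review("slightly negative", "neutral")
```

In the third call the SPSSAU label is kept as-is, reflecting its role as the primary scoring system.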
Missingness Handling
After structuring the dataset into an indicator-sentiment score matrix, missing values may still exist due to incomplete sentiment analysis, partial keyword matching, or cases where consumer reviews do not contain sufficient information to map specific indicators. Since PLS-SEM performs best with complete datasets for optimal estimation, to ensure model validity and robustness, missing values must be handled appropriately before proceeding with modeling.
To address this issue, this study will apply K-Nearest Neighbors (KNN) imputation, a widely used technique for handling missing data. KNN imputation identifies the k most similar observations (neighbors) in the dataset and estimates the missing value based on the weighted average or mode of these neighbors. This approach ensures that missing values are replaced based on patterns observed in the existing data rather than relying on arbitrary assumptions, thereby preserving the integrity of consumer sentiment information. The decision to use KNN imputation instead of Mean, Regression, and Multiple Imputation (MI) is based on methodological considerations related to the nature of the sentiment score dataset: Mean imputation oversimplifies missing data by replacing values with the average, reducing variance and distorting sentiment distributions. Regression imputation assumes fixed linear relationships, which do not capture the diverse linguistic expressions in consumer sentiment. MI generates probabilistic values that may fall outside the predefined sentiment scale, complicating interpretation. In contrast, KNN imputation preserves data structure, maintains scale consistency, and imputes missing values based on real observed patterns, ensuring methodological robustness.
The KNN imputation process will proceed as follows. First, the dataset will be examined for missing values in the indicators’ scores. If an indicator is missing from a given review, this suggests that either the sentiment classification failed to detect relevant content or the review did not contain an explicit sentiment toward that dimension. Next, the number of nearest neighbors (k) will be selected using cross-validation techniques to optimize imputation accuracy, with different k values (e.g., 3, 5, and 7) tested to determine the most effective configuration. Once the optimal k-value is identified, the algorithm will find similar consumer reviews based on the available sentiment scores in other MCCP dimensions. The weighted average of the k neighbors will then replace the missing value. Finally, statistical validation will be performed to compare the distribution of the original and imputed data, ensuring that the imputation process does not introduce bias or distort the dataset’s structure.
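The core of this procedure can be sketched in plain Python. The version below is a simplified illustration, not the implementation used in the study: it measures similarity on the dimensions two reviews both have scores for, uses inverse-distance weights (an assumption about the weighting scheme), and omits the cross-validated choice of k. The example matrix is hypothetical.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries with a distance-weighted mean of the k nearest rows."""
    def distance(a, b):
        # Compare two reviews only on dimensions that both have scored.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for col, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors must have an observed value in this column.
            candidates = sorted(
                (distance(row, other), other[col])
                for j, other in enumerate(rows)
                if j != i and other[col] is not None
            )
            neighbors = [(d, v) for d, v in candidates[:k] if d < math.inf]
            if not neighbors:
                continue  # leave the gap if no comparable review exists
            weights = [1.0 / (d + 1e-9) for d, _ in neighbors]
            total = sum(w * v for w, (_, v) in zip(weights, neighbors))
            imputed[i][col] = total / sum(weights)
    return imputed


# Hypothetical sentiment-score matrix: rows are reviews, columns are indicators.
rows = [
    [7, 5, None],
    [7, 5, 7],
    [5, 5, 7],
    [1, 3, 1],
]
filled = knn_impute(rows, k=2)
```

Because each imputed value is a weighted average of observed scores, it stays within the range of the neighbors' scores and therefore within the 1 to 7 scale.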
Because the missingness in sentiment scores primarily arises from cases where consumer reviews lack explicit expressions related to specific dimensions rather than from random or systematic data loss (Kim & Im, 2018; Li et al., 2023; Mellinas & Leoni, 2024; Mitra et al., 2023; Zhou et al., 2024), addressing it through statistical reporting would provide limited additional insight beyond what is already captured by the text-mining framework. Given that missing values are distributed across various MCCP dimensions rather than concentrated in specific categories, KNN imputation was selected to leverage existing sentiment scores and maintain contextual consistency in consumer evaluations.
Reliability Testing
To ensure the reliability of the structured dataset before conducting PLS-SEM analysis, this study will evaluate internal consistency using McDonald’s Omega (ω). While Cronbach’s Alpha (α) has traditionally been the most widely used measure for reliability testing, recent research suggests that McDonald’s Omega provides a more robust and theoretically sound estimate of internal consistency, particularly in cases where the scale exhibits heterogeneity among its indicators (Njeri et al., 2024; Orçan, 2023; Trizano-Hermosilla & Alvarado, 2016). Given that our dataset is derived from consumer sentiment analysis of textual reviews rather than traditional survey-based Likert-scale responses, heterogeneity in item loadings is expected, making McDonald’s Omega a more appropriate reliability metric. Unlike Cronbach’s Alpha, which assumes tau-equivalence (i.e., all items contribute equally to the latent construct), McDonald’s Omega does not require this assumption. Instead, it accounts for the actual variance explained by each indicator, providing a more precise measurement of internal consistency. This is particularly important for sentiment-based data, where different MCCP dimensions may have varying levels of contribution to consumer perception. For interpretation, a McDonald’s Omega value above 0.80 indicates high reliability, while values between 0.70 and 0.80 suggest good reliability; values between 0.60 and 0.70 are considered acceptable, and those below 0.60 indicate low reliability and potential measurement concerns. In this study, SPSSAU will be used to calculate the McDonald’s Omega of each dimension and the overall result.
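For a single-factor model, omega is commonly computed as the squared sum of the factor loadings divided by that quantity plus the sum of the error variances. The sketch below applies this formula and the interpretation bands given above. The loadings are hypothetical, the error variances assume standardized indicators, and the treatment of values falling exactly on a band boundary is an assumption; SPSSAU's internal computation may differ.

```python
def mcdonald_omega(loadings, error_variances):
    """Omega for one factor: (sum of loadings)^2 / (that + sum of errors)."""
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))


def interpret_omega(omega):
    """Map an omega value to the reliability bands described in the text."""
    if omega > 0.80:
        return "high"
    if omega >= 0.70:
        return "good"
    if omega >= 0.60:
        return "acceptable"
    return "low"


# Hypothetical standardized loadings for one dimension with three indicators.
loadings = [0.8, 0.7, 0.6]
# With standardized indicators, each error variance is 1 - loading^2.
errors = [1 - l ** 2 for l in loadings]
omega = mcdonald_omega(loadings, errors)
band = interpret_omega(omega)
```

Unequal loadings enter the formula directly, which is why omega does not need the tau-equivalence assumption.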
PLS-SEM Model Specification
Given the nature of the extracted dimensions from consumer sentiment analysis, this study will adopt a formative measurement model rather than a reflective measurement model. The choice of a formative specification is based on the theoretical and methodological considerations regarding the nature of MCCP dimensions in consumer perception. A reflective measurement model assumes that latent constructs cause their indicators, meaning that all indicators should be highly correlated and interchangeable with one another. If one indicator is removed, the meaning of the construct remains unchanged. This approach is suitable for traditional psychological scales, where responses to different items reflect an underlying trait (e.g., satisfaction or brand trust). However, in our case, MCCP perception is not a single latent trait that generates consumer sentiment scores but rather an emergent construct composed of distinct dimensions. These dimensions are not necessarily correlated, but together, they define the overall perception of MCCPs. By contrast, a formative measurement model assumes that indicators collectively form the latent construct, meaning that each dimension uniquely contributes to the construct’s meaning. In the context of consumer MCCP perception, dimensions such as aesthetic appeal, craftsmanship, functionality, and emotional engagement independently shape the overall assessment of a product. The absence or variation in one dimension does not invalidate the construct; instead, it alters its overall composition and meaning. Given that our sentiment-based dataset captures diverse consumer expressions toward different aspects of MCCPs, a formative approach allows us to model these dimensions as independent yet complementary contributors to MCCPs perception. 
Thus, using a formative measurement model in PLS-SEM ensures that our analysis reflects the causal relationship between dimensions and overall perception rather than assuming that consumer sentiment toward one dimension is necessarily correlated with another. This approach provides a more accurate representation of how consumers evaluate MCCPs, allowing us to quantify the relative importance of each dimension in shaping consumer satisfaction.
In PLS-SEM with a formative measurement model, it is crucial to report appropriate indicators while avoiding internal consistency metrics, such as Composite Reliability (CR) and Average Variance Extracted (AVE), which are only applicable to reflective models. Essential indicators include the Variance Inflation Factor (VIF), which checks that multicollinearity is within acceptable limits (VIF < 5, ideally < 3.3), and outer weights, which indicate each indicator’s contribution to the construct, with significance tested via bootstrapping (p < 0.05). Path coefficients should also be reported to assess the strength and direction of relationships (range: -1 to 1, p < 0.05). Additionally, R² will measure the variance explained by independent variables (≥0.25 weak, ≥0.50 moderate, ≥0.75 substantial), while f² will quantify effect size (≥0.02 small, ≥0.15 medium, ≥0.35 large). Optional indicators include Q² (predictive relevance), which is used to assess out-of-sample prediction (Q² > 0 indicates relevance). To determine the primary indicator influencing the dimension of consumer satisfaction, Importance-Performance Map Analysis (IPMA) will be used, which also provides managerial insights. Because PLS-SEM prioritizes prediction over overall model fit and formative constructs are defined by their indicators rather than inferred from them (Henseler et al., 2016), the model evaluation does not rely on traditional model fit indices (e.g., RMSEA, SRMR, CFI). Thus, model fit metrics are treated as optional. Finally, PLS-SEM researchers suggest that to detect a medium effect size (f² = 0.15) with 80% power, at least 150-200 cases are required (Cohen, 1992; Hair et al., 2021); thus, the minimum sample size for PLS-SEM in this study is set at 200.
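The effect size f² compares the R² of the model with and without a given predictor: f² = (R² included − R² excluded) / (1 − R² included). The sketch below computes it and applies the thresholds listed above. The R² values are hypothetical, and the labels for values below the lowest threshold are assumptions added for completeness.

```python
def f_squared(r2_included, r2_excluded):
    """Cohen's f^2 for one predictor, from R^2 with and without it."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)


def label_f2(f2):
    """Effect-size bands: 0.02 small, 0.15 medium, 0.35 large."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"  # label for sub-threshold values is an assumption


def label_r2(r2):
    """Explained-variance bands: 0.25 weak, 0.50 moderate, 0.75 substantial."""
    if r2 >= 0.75:
        return "substantial"
    if r2 >= 0.50:
        return "moderate"
    if r2 >= 0.25:
        return "weak"
    return "very weak"  # label for sub-threshold values is an assumption


# Hypothetical R^2 values with and without one predictor construct.
effect = f_squared(r2_included=0.50, r2_excluded=0.40)
effect_label = label_f2(effect)
fit_label = label_r2(0.50)
```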
The PLS-SEM analysis will be conducted in SmartPLS 4.0.0.
Results
Data Overview
The dataset (containing 11,035 consumer reviews) was collected in November 2023, covering reviews from November 2021 to November 2023. This two-year span provides a comprehensive representation of consumer perceptions, minimizing the influence of short-term sentiment fluctuations and ensuring relevance to contemporary MCCP trends. The reviews are primarily in Simplified Chinese, with a small proportion in Traditional Chinese and English, which aligns with Tmall’s primary user base in Mainland China. The limited presence of English reviews does not diminish the study’s global relevance, as consumer behavior in cultural product consumption shares cross-market similarities. Insights from this research can help museums worldwide optimize their cultural product strategies and enhance their engagement with Chinese consumers.
Data Preprocessing
The data preprocessing followed a systematic approach to ensure quality and consistency. The first step involved removing empty or non-informative reviews, such as those with default system messages like “此用户没有填写评价” (translated: “This user did not provide a review”). Reviews with extremely short content, such as a single word or a generic phrase like “好” (good), “棒” (great), “还行” (okay), or “不错” (not bad), were excluded. These reviews, while indicative of sentiment, did not provide substantive information about the product and were unsuitable for evaluating MCCP dimensions. Similarly, reviews containing only emojis or special characters were removed, as they lacked explicit textual content for analysis and interpretation. Duplicate reviews, often resulting from bulk purchases or copy-pasting behavior, were also identified and removed to prevent redundancy and bias in the dataset.
The data cleaning process was conducted in two stages: the first stage, in July 2024, resulted in a labeled training set of 2,125 reviews, and the second stage, in January 2025, expanded the dataset to 3,414 reviews. While the exact steps of the two cleaning processes slightly differed—particularly with the addition of a tokenization step in the 2024 phase—both followed a consistent overall methodology, as detailed in the methodology section. This temporal separation between the two stages helped avoid potential biases from revisiting the data too closely, allowing for a more objective approach to processing the dataset. This added layer of data handling can be seen as a strength, ensuring the integrity and reliability of the results.
Dimensions and Indicators
To extract consumer-perceived MCCP dimensions, a two-stage hybrid approach integrating LDA and SVM, as well as BERT and K-Means, was employed with the assistance of an LLM. This section presents the results of each step, including topic optimization, classification performance, and clustering validation.
LDA Topic Modeling
The first stage applied LDA to identify underlying themes within consumer reviews (using the July 2024 dataset, comprising 2,125 reviews). To determine the optimal number of topics, coherence scores were calculated for different topic numbers. The coherence score plot (see Figure 2) indicated that six topics yielded the highest coherence, confirming their suitability for further analysis.
After establishing K = 6 as the optimal number of topics, the top-weighted keywords for each theme were identified. These keywords served as the basis for interpreting topics and naming themes. The identified topics and their corresponding keywords suggest that consumer perceptions of MCCPs are structured around six main themes: Functionality & Quality, Craftsmanship & Price, Design Appeal, Cultural Perception, Emotional Responses, and Aesthetic Impression. Table 2 summarizes the extracted themes, along with the most relevant keywords. The initial probabilistic topic distribution is presented in Figure 3.
SVM Classification for Theme Labeling
Following LDA, an SVM classifier was trained to refine the topic assignments. Since LDA-derived themes are probabilistic rather than deterministic, the SVM classification process helped improve accuracy by ensuring that each consumer review was assigned to a dominant theme. The dataset from July 2024 already had clear dominant themes derived from LDA, and the SVM model was used to validate whether LDA had correctly extracted these themes. During the validation process, the entire dataset of 2,125 reviews was split into 80% for training (1,700 reviews) and 20% for testing (425 reviews). The SVM model achieved impressive results, with an accuracy of 94.12%, a precision of 94.26%, a recall of 94.12%, and an F1-score of 94.11%. These results demonstrate that the SVM model reliably predicted the LDA-derived themes, confirming the robustness and validity of the LDA analysis.
BERT and K-Means Clustering Validation
BERT and K-Means clustering were used as supplementary methods to LDA. The clustering analysis revealed that the highest silhouette score was achieved with 7 clusters (see Figure 4).
After applying BERT to generate 768-dimensional vector representations for the consumer reviews, K-Means clustering was used to group these reviews into distinct clusters. For each cluster, a label was assigned based on its predominant theme, and the reviews were grouped accordingly. Example comments from each cluster are shown in Table 3.
We then passed 100 randomly selected reviews from each cluster (due to input length limitations) into two advanced LLMs, namely DeepSeek R1 and ChatGPT o3-mini-high. These models were tasked with summarizing the key content of the reviews in each cluster using one to two words (see Table 4). Upon comparing the results in Table 4, we found that the outputs of the two models were largely consistent. The consistency between the keywords generated by the language models and the themes identified by LDA provided additional support for the reliability of the LDA-derived themes.
Based on the results of LDA+SVM and BERT+K-Means, we propose a framework of dimensions and indicators for further analysis (see Table 5).
Since PLS-SEM latent variables typically require 3-4 observed indicators per construct, the current framework does not yet meet this criterion. To ensure both the adequacy and accuracy of dimensions and indicators, we further refined the initial framework derived from LDA+SVM and BERT+K-Means. Specifically, we applied SPSSAU’s text analysis tool to segment the review texts and identify high-frequency keywords. These high-frequency terms were then used to supplement the previously established indicators, enhancing the robustness of the measurement framework. Since SPSSAU’s text analysis tool initially extracted 1,000 keywords (with frequencies ranging from 2 to 1,769), processing all keywords would be impractical and unnecessary. To refine the dataset, a frequency threshold was established, and keywords with fewer than 10 occurrences were excluded. As a result, 152 keywords were retained for analysis. We rearranged the 152 keywords according to the framework of dimensions and indicators (see Table 6).
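The frequency threshold described above amounts to counting keyword occurrences and keeping those that appear at least 10 times. A minimal sketch follows, assuming the reviews have already been segmented into tokens; the token list is hypothetical and the function is not part of SPSSAU.

```python
from collections import Counter


def frequent_keywords(tokens, min_count=10):
    """Keep keywords that occur at least min_count times."""
    counts = Counter(tokens)
    return {word: n for word, n in counts.most_common() if n >= min_count}


# Hypothetical token stream from segmented reviews.
tokens = ["exquisite"] * 12 + ["packaging"] * 10 + ["delivery"] * 3
kept_keywords = frequent_keywords(tokens)
```

A keyword with exactly 10 occurrences is retained, since only those with fewer than 10 are excluded.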
In summary, at this stage, LDA provided an initial framework, grouping reviews based on keyword co-occurrence. SVM improved classification accuracy, ensuring robust theme assignment. BERT+K-Means validated the semantic grouping, refining the theme framework, and text analysis supplemented it with specific keywords. After this process, we determined the dimensions of consumers’ perception of an MCCP and the indicators they used to describe it.
Sentiment Analysis
To quantify consumer sentiment toward different MCCP dimensions, this study applied a dual-method sentiment analysis approach, integrating SPSSAU’s sentiment classification module and Alibaba Cloud’s sentiment analysis API. Each consumer review was assigned a sentiment score based on SPSSAU’s four-level classification system: Positive (7 points), Slightly Positive (5 points), Slightly Negative (3 points), and Negative (1 point). To enhance classification reliability, Alibaba Cloud’s sentiment classifier, which categorizes sentiment into three levels (Positive, Neutral, and Negative), was used for cross-validation. Neutral cases were assigned 4 points.
To further ensure sentiment classification accuracy, manual verification was conducted for cases where the two classifiers produced conflicting results (i.e., one classifier labeled a review as positive and the other as negative) by an associate professor with over 13 years of experience in the MCCP research field. Additionally, reviews flagged as potential errors by the Alibaba Cloud tool were examined. Each sentence and its clauses within these reviews were carefully analyzed and marked based on their expressed meaning, and sentiment scores were reassigned accordingly to reflect the most accurate interpretation. In this process, we did not adjust sentiment classifications labeled as slightly positive or slightly negative by SPSSAU based on Alibaba Cloud’s sentiment analysis tool. The primary reason for this decision is that Alibaba Cloud’s tool provides only three broad sentiment categories (i.e., positive, neutral, and negative), whereas SPSSAU offers a more granular four-level classification. Given that Alibaba Cloud’s labels are coarser, they lack the necessary detail to refine SPSSAU’s sentiment scores effectively. Therefore, as long as there were no direct contradictions between the two tools (e.g., one labeling a review as positive and the other as negative), we retained SPSSAU’s original four-level classification.
The distribution of sentiment scores is summarized in Table 7 below.
Score Matrixization
To facilitate quantitative analysis using PLS-SEM, the unstructured textual data was transformed into a structured sentiment score matrix. This matrix transformation process ensured that each consumer review was systematically linked to the MCCP dimensions identified through LDA+SVM, BERT+K-Means, and high-frequency keyword analysis. By mapping sentiment scores to relevant dimensions, the study enabled a structured evaluation of consumer perceptions across multiple aspects of MCCPs. Once sentiment classification was finalized, the sentiment scores were mapped to their respective MCCP dimensions. For reviews that contained multiple sentiment expressions within a single entry, the sentiment scores were distributed accordingly. For instance, a review expressing satisfaction with the design but dissatisfaction with the price resulted in a positive sentiment score for design aesthetics and a negative sentiment score for the price. If a review mentioned only one dimension, its sentiment score was assigned exclusively to that category, while the remaining dimensions were left unassigned. The structured dataset was then organized into a sentiment score matrix, where each row represented a consumer review, each column corresponded to an MCCP dimension, and the values indicated sentiment scores assigned to each dimension. To ensure more accurate and reliable mappings, the matrix was reviewed by the expert we invited. A sample of this structured matrix is shown in Table 8.
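The matrix layout described above can be illustrated with a small sketch. The dimension names and scores are hypothetical, and `None` stands in for an unassigned cell; how the study's software encodes missing cells is not stated.

```python
# Hypothetical subset of MCCP dimensions used as matrix columns.
DIMENSIONS = ["design", "price", "culture"]


def build_score_matrix(review_scores, dimensions):
    """One row per review, one column per dimension; None marks unassigned."""
    return [[scores.get(dim) for dim in dimensions] for scores in review_scores]


# Hypothetical per-review sentiment scores keyed by the dimension mentioned.
review_scores = [
    {"design": 7, "price": 1},  # liked the design, disliked the price
    {"culture": 5},             # mentioned only one dimension
]
matrix = build_score_matrix(review_scores, DIMENSIONS)
```

The first row carries two scores from one review, while the second leaves two dimensions unassigned, which is the source of the missing values treated in the next step.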
Missing Data Processing
In the initial sentiment score matrix, a significant proportion of missing values was observed, with some indicators exhibiting missing rates as high as 99% (see Figure 5). Since PLS-SEM requires complete datasets for accurate model estimation, handling missing sentiment scores was a critical step before proceeding with structural modeling. Missing values in the sentiment score matrix primarily arose for three reasons. First, some consumer reviews did not explicitly mention specific MCCP dimensions. For instance, a review may discuss design aesthetics and cultural attributes but not address price or functionality. In such cases, no sentiment score could be assigned to the unmentioned dimensions. Second, the sentiment analysis tools occasionally failed to classify a sentiment score due to ambiguous phrasing or indirect expressions in consumer reviews. Third, the selection of keywords, particularly those that barely met the minimum occurrence threshold (i.e., 10) in the dataset, also contributed to the high missing rate.
To effectively address this issue, a keyword aggregation approach was implemented. Instead of treating each keyword as an independent metric, the 152 identified keywords (Column 3 in Table 6) were grouped under their respective indicators (Column 2 in Table 6), with their sentiment occurrences summed to generate aggregated indicator scores. This aggregation allowed missing values at the keyword level to be absorbed within the broader indicator-level structure, thereby reducing data missingness (black line in Figure 5) and streamlining the complexity of the PLS-SEM model.
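A minimal sketch of this aggregation step is given below. The keyword-to-indicator mapping is hypothetical (the actual mapping is in Table 6), and the rule that an indicator stays missing only when all of its keywords are missing is an assumption about how the summation treats gaps.

```python
# Hypothetical keyword -> indicator mapping; the study's mapping is in Table 6.
keyword_to_indicator = {
    "cute": "appearance",
    "pretty": "appearance",
    "expensive": "price",
}


def aggregate_keywords(keyword_scores, mapping):
    """Sum keyword-level scores into indicator-level scores for one review.

    keyword_scores maps keyword -> score or None. An indicator is left as
    None only if every keyword grouped under it is missing (assumption).
    """
    totals = {indicator: None for indicator in set(mapping.values())}
    for keyword, score in keyword_scores.items():
        if score is None:
            continue
        indicator = mapping[keyword]
        totals[indicator] = (totals[indicator] or 0.0) + score
    return totals


# One review that scores a single keyword out of three.
review = {"cute": 1.0, "pretty": None, "expensive": None}
indicator_scores = aggregate_keywords(review, keyword_to_indicator)

keyword_missing = sum(v is None for v in review.values()) / len(review)
indicator_missing = (
    sum(v is None for v in indicator_scores.values()) / len(indicator_scores)
)
print(indicator_scores, keyword_missing, indicator_missing)
```

In this toy review, two of three keyword cells are missing but only one of two indicator cells is, which is the absorption effect the aggregation relies on.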
To systematically address the remaining missing values, KNN imputation was applied. While the 20% missing data threshold is commonly used in structured survey data, it is less applicable to consumer reviews, where missing sentiment scores arise from content-driven omissions rather than data errors. Strictly following the 20% rule would lead to excessive data loss, excluding valuable insights. Although studies suggest KNN can handle up to 90% missingness (Marchang & Tripathi, 2021), such high thresholds increase uncertainty. To strike a balance, this study adopted a 50% threshold, ensuring enough observed sentiment scores for reliable imputation while avoiding excessive reliance on estimated values. Thus, before applying KNN, cases with excessive missing values (a missing rate of more than 51%) were removed from the dataset. After this process, 523 entries remained, and the average missing rate fell from 76.04% to 46.49%.
By removing reviews with excessive missing values and applying KNN imputation, the study ensured that all retained cases had complete sentiment scores across all MCCP dimensions. This imputation step significantly enhanced the dataset’s reliability and validity, providing a solid foundation for subsequent PLS-SEM analysis.
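The two-step procedure (row filtering, then KNN imputation) can be sketched as follows. This is an illustrative pure-Python version under stated assumptions, not the study's implementation: the number of neighbours `k`, the distance definition (Euclidean over co-observed columns, rescaled for skipped columns), the `max_missing` cutoff of 0.5, and the toy data are all hypothetical choices.

```python
import math


def missing_rate(row):
    """Share of cells in a row that are missing (None)."""
    return sum(v is None for v in row) / len(row)


def drop_sparse_rows(rows, max_missing=0.5):
    """Keep rows whose missing rate does not exceed max_missing."""
    return [r for r in rows if missing_rate(r) <= max_missing]


def distance(a, b):
    """Euclidean distance over co-observed columns, rescaled by the
    fraction of columns actually compared; inf if nothing is shared."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


def knn_impute(rows, k=2):
    """Fill each missing cell with the mean of that column over the k
    nearest rows that observe it. Distances use the original rows only."""
    completed = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:
                new_row[j] = sum(nearest) / len(nearest)
        completed.append(new_row)
    return completed


rows = [
    [6.0, 5.0, None, 6.0],    # 25% missing: kept and imputed
    [6.0, 5.0, 4.0, 6.0],
    [2.0, 1.0, 2.0, 2.0],
    [None, None, None, 3.0],  # 75% missing: removed before imputation
    [5.0, 5.0, 6.0, 5.0],
]
kept = drop_sparse_rows(rows)
imputed = knn_impute(kept, k=2)
print(len(kept), imputed[0])
```

In the toy data, the first row's missing cell is filled from its two closest neighbours (the second and last rows), while the dissimilar third row does not contribute.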
Measurement Model Construction
The MCCP perception model was developed through a data-driven, multi-step approach that integrates text mining, sentiment analysis, expert validation, and structural modeling. This process ensures that the final model is empirically grounded while remaining theoretically relevant to existing frameworks of consumer perception. Building on the previous stages, we have systematically extracted indicators from consumer reviews through text mining techniques, including LDA+SVM and BERT+K-Means. These indicators were then grouped into broader dimensions based on expert validation, ensuring a structured and conceptually meaningful framework. Now, before proceeding to the reliability assessment, we first establish the theoretical measurement model, which organizes these dimensions into a structured representation of MCCP perception, consumer experience, and satisfaction. This model will serve as the foundation for subsequent PLS-SEM analysis.
In the literature review, we identified three key factors (Product Perception, Consumption Experience, and Satisfaction) as the foundation for understanding consumer evaluation of MCCPs. Based on this framework, we structured the dimensions and indicators extracted from text analysis into these categories, ensuring consistency between empirical findings and theoretical insights. Each of these constructs consists of multiple dimensions, which were derived from text mining, expert validation, and theoretical alignment. The final measurement model is presented in Table 9, which details the latent constructs, their corresponding dimensions, and the indicators associated with each.
The hierarchical model structure is illustrated in
Figure 6, which visualizes the relationships between latent constructs and their indicators.
Descriptive Statistics and Reliability
Before conducting the reliability analysis, we first present the descriptive statistics of the extracted sentiment scores for each indicator (see Table 10).
In Table 10, the mean values (5.41–6.15) suggest a generally positive consumer sentiment, with standard deviations (1.0–1.4) indicating moderate variability. The full range (1–7) observed in all indicators confirms a diverse sentiment distribution across reviews.
To evaluate the internal consistency of the measurement model, McDonald’s Omega was computed. Unlike Cronbach’s Alpha, McDonald’s Omega does not assume equal contributions from all indicators, making it a more robust measure of internal consistency. The overall McDonald’s Omega for the nine dimensions was 0.794, indicating good reliability. The detailed reliability results, including item-deleted McDonald’s Omega values, are presented in
Table 11.
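For reference, McDonald's Omega for a single-factor model with standardized indicators can be written in terms of the factor loadings: the squared sum of loadings divided by that quantity plus the summed unique variances. The sketch below uses this form with hypothetical loadings (not the study's estimates) and assumes each unique variance equals one minus the squared loading.

```python
def mcdonald_omega(loadings):
    """Omega for a one-factor model with standardized loadings.

    omega = (sum of loadings)^2 / ((sum of loadings)^2 + sum of unique
    variances), with unique variance = 1 - loading^2 (assumption:
    standardized indicators, uncorrelated errors).
    """
    common = sum(loadings) ** 2
    unique = sum(1.0 - loading ** 2 for loading in loadings)
    return common / (common + unique)


# Hypothetical loadings, for illustration only.
omega_equal = mcdonald_omega([0.7, 0.7, 0.7])
omega_unequal = mcdonald_omega([0.9, 0.7, 0.5])
print(round(omega_equal, 3), round(omega_unequal, 3))
```

Because each indicator enters through its own loading, the coefficient accommodates indicators that contribute unequally, which is the property that motivates its use here over Cronbach's Alpha.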
In Table 12, the reliability analysis of the latent construct using McDonald’s Omega indicates that Product Perception (0.786) demonstrates good reliability, suggesting that its indicators are well-aligned in capturing consumer evaluations of MCCPs. Consumer Satisfaction (0.675) and Consumption Experience (0.629) fall within the acceptable range, indicating moderate internal consistency. While these values are slightly lower than the conventional 0.70 threshold for strong reliability, they are still within an acceptable range for exploratory research, particularly in formative measurement models where indicator variance is expected.
PLS-SEM Report
Outer Weight
The outer weights analysis provides insights into the relative contributions of each indicator to its respective construct, helping to understand which factors play a more significant role in shaping consumer perceptions and experiences. The results in Table 13 indicate that for Product Perception, Product Creativity (0.853) is the most influential factor, suggesting that consumers primarily evaluate MCCPs based on their creative attributes. At the same time, Product Functionality (0.061, p-value = 0.325) has a negligible and statistically non-significant weight. In the aspect of Consumption Experience, Product Price (0.711) and Product Quality (0.514) are the strongest contributors, emphasizing the importance of pricing fairness and product reliability, whereas Service Experience (0.302) plays a supporting role. Finally, in terms of Consumer Satisfaction, Emotional Response (0.690) has the highest weight, indicating that emotional engagement has a significant influence on overall consumer satisfaction.
Path Coefficients and Predictive Effects
The structural model was evaluated using path coefficients (β), t-values, and p-values to examine the hypothesized relationships.
Table 14 summarizes the results. All paths showed statistically significant relationships (p < 0.05), with Product Perception demonstrating a substantial direct effect on Consumption Experience (β = 0.545, t = 14.153, p < 0.001) and Consumer Satisfaction (β = 0.442, t = 7.324, p < 0.001). Meanwhile, Consumption Experience had a moderate effect on Consumer Satisfaction (β = 0.237, t = 3.797, p < 0.001).
R² and f² Values
The predictive power of the model was assessed using R² values, which indicate the proportion of variance in the dependent variable that the predictors explain. The R² values for Consumption Experience and Consumer Satisfaction were 0.297 and 0.366, respectively, reflecting moderate explanatory power. The f² values demonstrated the relative effect size of predictors. Product Perception had a large effect on Consumption Experience (f² = 0.422) and a medium effect on Consumer Satisfaction (f² = 0.217). In contrast, the effect of Consumption Experience on Consumer Satisfaction was small (f² = 0.062).
Table 15. R² and f² Values.
| Dependent Variable | R² | f² (Effect Sizes) |
| --- | --- | --- |
| Consumption Experience | 0.297 | Product Perception → Consumption Experience: 0.422 (Large) |
| Consumer Satisfaction | 0.366 | Product Perception → Consumer Satisfaction: 0.217 (Medium); Consumption Experience → Consumer Satisfaction: 0.062 (Small) |
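The f² values reported above follow the standard definition: the change in R² when a predictor is omitted, divided by the unexplained variance of the full model. Because Product Perception is the only predictor of Consumption Experience among the reported paths, the R² without it is taken as zero here, and the reported f² can be reproduced from the reported R². The sketch below assumes Cohen's conventional cutoffs (0.02, 0.15, 0.35); the function names are illustrative.

```python
def f_squared(r2_included, r2_excluded):
    """Effect size f2 = (R2_included - R2_excluded) / (1 - R2_included)."""
    return (r2_included - r2_excluded) / (1.0 - r2_included)


def effect_label(f2):
    """Classify f2 with Cohen's conventional cutoffs (0.02 / 0.15 / 0.35)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


# Consumption Experience has a single predictor, so dropping it leaves R2 = 0.
f2_pp_ce = f_squared(0.297, 0.0)
print(round(f2_pp_ce, 3), effect_label(f2_pp_ce))

# Labels for the two effects on Consumer Satisfaction, using reported f2 values.
print(effect_label(0.217), effect_label(0.062))
```

The two f² values for Consumer Satisfaction cannot be recomputed this way from the reported figures alone, since the R² of the reduced models is not given; only their labels are checked.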
Multicollinearity
To ensure the validity of the formative constructs in the model, multicollinearity was assessed using VIF values. As shown in
Table 16, all VIF values ranged from 1.001 to 1.433, which is well below the critical threshold of 5.0, indicating that there is no significant multicollinearity among the predictors.
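As a reading aid, the VIF of a predictor equals 1 / (1 − R²ⱼ), where R²ⱼ is the variance in that predictor explained by the other predictors. The sketch below inverts this relation to show how much shared variance the largest reported VIF implies; the function names are illustrative.

```python
def vif(r2_j):
    """Variance inflation factor for a predictor whose variance is
    explained to the degree r2_j by the remaining predictors."""
    return 1.0 / (1.0 - r2_j)


def shared_variance_from_vif(vif_value):
    """Invert the VIF formula: r2_j = 1 - 1 / VIF."""
    return 1.0 - 1.0 / vif_value


# Largest VIF reported in the text, and the conventional cutoff of 5.0.
shared_at_max = shared_variance_from_vif(1.433)
shared_at_cutoff = shared_variance_from_vif(5.0)
print(round(shared_at_max, 3), round(shared_at_cutoff, 3))
```

A VIF of 1.433 corresponds to roughly 30% shared variance, compared with 80% at the 5.0 cutoff.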
Q² Value
The indicator-level predictive relevance was assessed using the cross-validated communality Q² values (
Table 17). Among all indicators, only Product Creativity (Q² = 0.046) demonstrated weak predictive relevance, suggesting that it contributes partially to its latent construct. However, all other indicators, including Product Quality (Q² = -0.212), Product Price (Q² = -0.100), and satisfaction-related dimensions (Q² < 0), exhibited no predictive relevance.
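The negative Q² values can be read through the general form Q² = 1 − SSE/SSO, where SSE is the sum of squared prediction errors and SSO is the sum of squared deviations from the mean (the error of a mean-only prediction). The sketch below is a simplified illustration with hypothetical numbers; it omits the blindfolding omission procedure that PLS-SEM software uses to generate the predictions.

```python
def q_squared(observed, predicted):
    """Q2 = 1 - SSE / SSO, with SSO taken around the observed mean.

    Simplified sketch: predictions are supplied directly rather than
    produced by a blindfolding procedure.
    """
    mean = sum(observed) / len(observed)
    sso = sum((o - mean) ** 2 for o in observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - sse / sso


observed = [1.0, 2.0, 3.0, 4.0]               # hypothetical indicator values
q2_good = q_squared(observed, [1.5, 2.0, 3.0, 3.5])
q2_bad = q_squared(observed, [4.0, 3.0, 2.0, 1.0])
print(q2_good, q2_bad)
```

A value below zero therefore means the model's predictions for that indicator are worse than simply predicting its mean.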
Model Fit
To evaluate the overall fit of the model, several fit indices were examined. The standardized root mean square residual (SRMR) was 0.046, which is well below the recommended threshold of 0.08, indicating a good model fit. Additionally, the unweighted least squares discrepancy (d_ULS = 0.097) and geodesic discrepancy (d_G = 0.030) were found to be low, suggesting a minimal discrepancy between the observed and estimated models. Finally, the normed fit index (NFI) was 0.899, which is close to the recommended threshold of 0.90, indicating an acceptable model fit. These results collectively confirm that the proposed model demonstrates a satisfactory level of model fit and provides reliable structural estimates.
Summary
This study identified Product Perception, Consumption Experience, and Satisfaction as the core dimensions shaping consumer evaluations of MCCPs. Text mining and sentiment analysis were used to extract key consumer themes, which were structured into measurable indicators and aggregated into higher-order constructs. To address missing data, KNN imputation was applied with a 50% threshold, ensuring sufficient observed values while preserving data integrity. Reliability analysis confirmed that the extracted dimensions exhibited acceptable internal consistency, supporting their suitability for further analysis.
PLS-SEM results validated the proposed theoretical framework, demonstrating that Product Perception significantly influenced both Consumption Experience and Satisfaction, while Consumption Experience played a secondary role in shaping Consumer Satisfaction. No multicollinearity issues were found, ensuring the stability of the model estimates. Among the indicators, Product Creativity, Product Price, and Emotional Response emerged as the strongest contributors to their respective dimensions. In contrast, Product Functionality showed minimal impact on consumers' product perception, suggesting that functionality may not be a primary evaluation criterion for MCCPs. Moreover, the importance-performance map analysis (IPMA) highlighted areas for strategic improvement, indicating that while some dimensions performed well, others, such as Product Functionality and Product Quality, had moderate importance but lower performance, presenting opportunities for enhancement. Finally, predictive relevance analysis showed that most indicators had limited predictive power, except for Product Creativity, which exhibited weak predictive relevance. The validated PLS-SEM model is illustrated in Figure 8, and the decisions on the hypotheses are summarized in Table 18.
Discussion
Methodological Contribution
This study introduces LLM-assisted semantic analysis as a methodological advancement in extracting consumer perceptions from large-scale textual data, significantly enhancing context-aware sentiment classification, thematic identification, and discourse structuring. Traditional text analysis methods, such as manual coding and keyword-based sentiment analysis, often fail to capture implicit meanings, contextual nuances, and linguistic complexities in consumer-generated discourse. By leveraging LLMs, this study enables more precise sentiment classification, identifying consumer emotions and attitudes beyond surface-level expressions. Additionally, LLM-assisted thematic analysis facilitates the discovery of underlying themes in consumer feedback, improving the accuracy of clustering product-related discussions. Furthermore, advanced discourse structuring techniques powered by LLMs enable a more comprehensive and contextualized understanding of how consumers evaluate creativity, aesthetics, functionality, emotional engagement, and cultural values in MCCPs. This methodological innovation addresses key limitations of conventional text mining approaches by reducing human bias, improving processing efficiency, and enabling large-scale automated analysis of consumer discourse. Unlike traditional methods that rely on predefined dictionaries or rule-based systems, LLM-assisted models adapt dynamically to diverse linguistic expressions, capturing subtle variations in consumer sentiment and product evaluations. This adaptability is particularly valuable in cultural and creative industries, where product perception is often highly subjective and emotionally driven.
Beyond enhancing analytical precision, this study establishes a scalable and AI-driven framework for future research in cultural heritage marketing and consumer behavior. The integration of LLM-assisted semantic analysis provides a replicable and extensible approach for analyzing consumer-generated content, offering insights that are both theoretically robust and practically actionable. As an early exploration of LLM-assisted analysis in MCCP evaluation, this research provides insights into the potential of AI-driven consumer sentiment analysis in understanding cultural product engagement. Rather than establishing a definitive benchmark or foundation, this study provides an empirical example that may inspire future research to refine and expand the application of LLMs in analyzing consumer perceptions within the cultural and creative industries.
Theoretical Contribution
Theoretically, this study contributes to the literature by constructing a data-driven framework for evaluating MCCPs, confirming the central role of Creativity, along with Aesthetics, Emotion, Design, and Culture, in shaping consumer perceptions. The research highlights Creativity as the primary driver of consumer satisfaction, reinforcing that innovation is a crucial determinant in MCCP evaluation, aligning with the findings of previous studies (He & Timothy, 2024; Xu & Zhang, 2024). While previous studies have often treated creativity as a subset of design (Han et al., 2021; Wodehouse & Casakin, 2022), our findings demonstrate that it functions as an independent dimension with distinct consumer expectations. This distinction underscores the necessity for MCCPs to transcend traditional design principles and proactively incorporate creative and innovative elements to capture consumer interest and engagement.
Moreover, our study contributes to the Diffusion of Innovation Theory of Rogers (2003) by providing new insights into how MCCPs gain consumer acceptance, expanding its application in the context of cultural and creative products. Rogers (2003) identifies five key attributes influencing the adoption of innovations in his theory: relative advantage, compatibility, complexity, observability, and trialability. The findings of this study provide new insights into how these attributes operate in the MCCP domain, particularly in the context of cultural product evaluation.
The concept of relative advantage is evident in how creativity significantly enhances the perceived value of MCCPs (Im & Workman, 2004; Kumar et al., 2024; Li et al., 2021; Modig & Rosengren, 2014). Unlike conventional mass-market products, MCCPs differentiate themselves through unique and innovative designs that elevate their cultural significance and artistic appeal. This reinforces the idea that product differentiation in the MCCP industry should be driven by creative reinvention rather than by merely replicating cultural elements.
Compatibility, which measures the alignment between innovation and consumer values, plays a crucial role in MCCP adoption, particularly through cultural compatibility rather than functional utility. Unlike mainstream consumer products that emphasize efficiency and practicality, we discovered that MCCP consumers prioritize cultural authenticity, emotional resonance, and aesthetic alignment over pure functionality, supported by the previous research on MCCPs (Beverland & Farrelly, 2010; Hui Cheng, Shi-jian Luo, et al., 2023; Cheng & Qiu, 2023; Kreuzbauer & Keller, 2017; Yu et al., 2022). Our findings suggest that successful MCCPs must integrate historical narratives, artistic traditions, and personal identity to establish deeper consumer connections. By embedding authenticity and emotional depth into product design, MCCPs can strengthen their cultural appeal, fostering stronger engagement and long-term adoption.
The role of complexity in MCCPs differs from its conventional interpretation in the Diffusion of Innovation Theory. Unlike standard products, where higher complexity may deter adoption, our findings suggest that MCCP consumers are willing to embrace intricate and creative designs as long as they maintain cultural coherence, in line with previous studies (Buschgens et al., 2024; Liu & Zhao, 2024). This indicates that complexity in MCCPs is not necessarily a barrier but rather an opportunity, provided that the creative elements remain interpretable and meaningful within their cultural context.
Observability and trialability—the extent to which an innovation’s benefits can be seen and experienced before adoption—are increasingly shaped by digital engagement strategies in the MCCP industry. Digital tools such as online previews, consumer-generated content, and augmented reality experiences enhance observability by allowing consumers to explore product attributes virtually before making a purchase (Lavoye et al., 2023; Sekri et al., 2024; Zulkarnain et al., 2024). This technological integration helps bridge the gap between digital and physical consumer experiences, making MCCPs more accessible and engaging.
Practical Implications
These findings have important implications for MCCP development, commercialization, and consumer engagement strategies. One of the primary challenges in the MCCP industry is product homogeneity (Hui Cheng, Xu Sun, et al., 2023; H. Li et al., 2024), where many products fail to stand out due to repetitive motifs and a lack of originality. Given that this research identified creativity as the strongest driver of consumer satisfaction, museums and cultural enterprises must shift their design strategies away from standardized cultural symbols and instead focus on original, narrative-driven innovations. MCCPs should blend historical authenticity with contemporary design elements to ensure they remain culturally relevant yet commercially viable.
Our findings underscore the strategic importance of consumer loyalty in driving MCCP growth. Consumers with strong emotional connections to MCCPs are more likely to repurchase, recommend, and co-create. Museums and brands should enhance engagement through interactive storytelling, limited-edition collaborations, and personalized designs. One practical approach, supported by previous studies (Y. Li et al., 2024; R. Zhang et al., 2024; Zou et al., 2024), is a cultural points system that rewards purchases, museum visits, and content creation, allowing consumers to redeem points for exclusive experiences and co-creation opportunities. Furthermore, consumer co-creation should be actively encouraged, aligning with research such as Y. Liu (2024), Shaw et al. (2021), and Monteiro et al. (2023). Museums can implement user-generated content (UGC) initiatives, enabling consumers to design, customize, and collaborate on MCCPs, thereby enhancing authenticity, cultural value, and consumer ownership.
Furthermore, the insights from our study provide a pathway for cultural institutions to transition towards financial sustainability. The traditional reliance on government funding for cultural heritage preservation is becoming increasingly challenging (Christensen, 2025; Thuc et al., 2024). By strengthening MCCP market viability through creativity-driven differentiation and consumer loyalty initiatives, museums can develop self-sustaining business models. Encouraging habitual engagement through loyalty programs (Frullo & Mattone, 2024; Thakker et al., 2024) and fostering brand attachment through cultural storytelling (Choi et al., 2024; Zimand-Sheiner, 2024) can reduce financial dependency on public funding while ensuring cultural heritage remains an integral part of contemporary consumer culture.
Lastly, the LLM-assisted semantic analysis provides a scalable and adaptable framework for integrating AI into heritage management and heritage marketing, enabling institutions to make data-driven strategic decisions based on large-scale consumer feedback. By enhancing the interpretation of consumer sentiment, thematic trends, and cultural preferences, this approach enables heritage organizations to optimize MCCP design, refine their marketing strategies, and enhance visitor engagement. The ability to process vast amounts of unstructured text data in real time enables a more nuanced understanding of consumer perceptions, ensuring that heritage products align with audience expectations while maintaining historical authenticity. Furthermore, by leveraging advanced natural language processing techniques, this method empowers heritage institutions and creative industries to anticipate market shifts, enhance consumer experience, and drive innovation, ultimately fostering sustainable cultural commercialization and broader public appreciation of heritage assets.