1. Introduction
1.1. Definition and Historical Development
Gamified teaching, also referred to as instructional gamification, denotes the use of selected game design elements in non-game learning activities in order to shape participation, persistence, and learning outcomes (Deterding et al., 2011; Kapp, 2012). Typical elements include points, badges, levels, quests, leaderboards, narratives, and rapid feedback loops, but the defining feature is not the presence of a full game world. Instead, gamification treats game elements as configurable design resources that can be embedded into existing curricula and tasks. For conceptual clarity, it is important to distinguish gamification from digital game-based learning, where a complete game serves as the primary learning vehicle, and from serious games, where learning is integrated into the game system itself (Gee, 2003; Plass et al., 2015; Prensky, 2003). Recent reviews continue to reaffirm the importance of this conceptual distinction, emphasizing that conflating gamification with full-scale educational games may obscure differences in instructional control, learner agency, and evaluative criteria, particularly in formal education settings (Chan & Lo, 2024).
Recent scholarship emphasizes that instructional gamification is best understood as a theoretically grounded design approach rather than a uniform motivational solution. While early foundations drew on long-standing ideas about activity, feedback, and social participation, contemporary research highlights how digital platforms enable fine-grained tracking of learner behavior and context-sensitive reinforcement. As gamification expanded from small-scale classroom interventions to platform-level implementations, theoretical integration became increasingly necessary to explain heterogeneous effects across learners and tasks. Self-determination theory remains the dominant explanatory framework, with recent work emphasizing how autonomy-supportive, competence-enhancing, and socially meaningful game elements predict sustained engagement and learning quality in digital environments (Ryan & Deci, 2020; Howard et al., 2021). Complementary perspectives have been updated through modern empirical research, including reinforcement and feedback design in digital learning systems (Sailer & Homner, 2020), self-efficacy and perceived agency in gamified environments (Furdu et al., 2022), expectancy–value processes in technology-mediated learning, social comparison and leaderboard effects (Dichev & Dicheva, 2023), achievement goal orientations in game-based learning contexts (Daumiller et al., 2021), and achievement emotions such as enjoyment, anxiety, and boredom in online and gamified learning (Loderer et al., 2020).
Recent meta-analytic evidence suggests that these theoretical mechanisms operate as conditional moderators rather than universal drivers, with learning effects varying substantially according to task characteristics, instructional duration, learner variables, and contextual conditions (Li et al., 2023). Meta-analytic synthesis further suggests that effects depend on design fit rather than element accumulation, calling for explicit links between mechanisms, tasks, and contexts (Sailer & Homner, 2020).
1.2. Gamified Teaching in General Education
In general education, gamified teaching has been adopted as a scalable strategy for increasing task engagement and supporting practice, especially in subjects that rely on repeated rehearsal and formative feedback. Research and practitioner discourse often connect gamification to game-based learning traditions that emphasize identity, agency, and situated problem solving (Gee, 2003), as well as to classroom-management aims such as participation regulation and goal setting (Kapp, 2012). Studies also indicate that well-designed gameful features can structure attention and provide feedback that supports mastery, although overly competitive elements can redirect attention toward performance displays rather than learning goals (Dichev & Dicheva, 2023). Recent syntheses further indicate that competitive and comparative mechanics yield highly heterogeneous effects across learner populations, depending on motivational orientations, prior achievement, and classroom climate (Li et al., 2023).
Across the broader educational ecosystem, gamification is increasingly intertwined with digital learning infrastructures, including learning management systems, mobile applications, and analytics dashboards. This infrastructure makes it easier to implement adaptive task sequences and to record behavioral traces that can complement self-report measures of motivation or attitudes (Plass et al., 2015). At the same time, reviews of digital-game-related learning caution that engagement benefits are not automatic and depend on how play features are aligned with instructional objectives and social arrangements (Cornillie et al., 2012; deHaan et al., 2010). More recent reviews reiterate that engagement indicators alone are insufficient unless accompanied by evidence of learning transfer and conceptual understanding (Chan & Lo, 2024).
1.3. Gamified Teaching in Language Learning
Language learning poses distinct demands because progress requires sustained exposure, repeated retrieval, and opportunities for meaningful interaction. Beyond mastery of forms, learners must develop communicative competence, including pragmatic and sociolinguistic appropriateness (Hymes, 1972), which depends on context-sensitive meaning making that has long been emphasized in linguistic anthropology and pragmatics (Malinowski, 1923; Ogden & Richards, 1923). Recent applied linguistics research further reconceptualizes communicative competence as a dynamic, usage-based, and socially situated construct shaped by interactional experience, digital mediation, and learner agency (Taguchi, 2020; Ellis & Shintani, 2023). Consequently, gamified language teaching often combines practice mechanics with communicative tasks in order to motivate repeated engagement while preserving authentic language use (Reinders & Benson, 2017; York et al., 2021).
At the linguistic level, gamified interventions frequently target vocabulary, grammar, and pronunciation through short cycles of retrieval practice, feedback, and progression. Such designs can support noticing and form–meaning mapping, which are critical processes in second language development (Schmidt, 1990). A recent systematic review of gamified tools for foreign language learning reports generally positive effects on vocabulary and form-focused outcomes, while highlighting substantial variability associated with task design, intervention duration, and assessment practices (Luo, 2023). In mobile and online environments, these interventions are often delivered through app-based activities that extend practice beyond class time and encourage persistence through micro-goals and streaks (Godwin-Jones, 2014; Peterson, 2012). Macro-skill development has also been explored, with gamified tasks supporting speaking, listening, reading, and writing through structured prompts, peer interaction, and performance feedback, although the strength of effects varies by task complexity and measurement choices (Sykes, 2009; Taguchi, 2015). Recent EFL-focused reviews suggest that macro-skill outcomes depend more strongly on instructional scaffolding and teacher mediation than on gamification mechanics alone (Helvich et al., 2023).
A defining challenge for language learning is that motivation and anxiety strongly shape willingness to communicate and sustained effort. The L2 motivation literature emphasizes that engagement is influenced by goals, self-beliefs, and classroom climate (Dörnyei, 2005), while foreign language anxiety research shows that evaluative pressure can reduce participation and performance (Horwitz et al., 1986). This makes the motivational design of gamified language tasks particularly consequential. Competitive mechanics may energize some learners but heighten threat for others, whereas cooperative and scaffolded designs may better support persistence and risk taking in communication (Deci & Ryan, 2000; Pekrun, 2006).
Recent work further suggests a shift toward technology-enhanced gamification that integrates intelligent support and richer interaction channels. Natural language processing and generative models can enable adaptive feedback and automated conversational practice, which may reshape how gamified tasks are personalized and evaluated (Matyakhan et al., 2024; Vaswani et al., 2017). Immersive and multimodal environments can also make interaction more situated, but they introduce new demands for scaffolding and for careful measurement of learner experience (Neville, 2015; Sanjaya & Kastuhandani, 2025; Yildirim & Karahan, 2023). These trends underscore the need to interpret gamification not as a fixed method but as a design space whose effects depend on the coordination of mechanics, tasks, and social contexts.
1.4. Limitations of Existing Reviews
Despite rapid growth, existing reviews of gamification in language education often emphasize cataloguing game elements, platforms, or outcomes, while giving less attention to how research communities form, how topics evolve over time, and where theoretical explanations concentrate or fragment. As a result, the field can appear more coherent than it actually is, and the reasons for inconsistent findings remain difficult to diagnose. In particular, the literature frequently reports motivation and engagement improvements, yet mechanisms and boundary conditions are not consistently specified, which limits cumulative theory building (Deci & Ryan, 2000; Sailer & Homner, 2020). Recent reviews note that while outcome-focused syntheses are abundant, analyses of collaboration patterns, co-citation structures, and thematic trajectories remain relatively scarce (Luo, 2023; Chan & Lo, 2024).
To address these limitations, this study conducts a bibliometric review that maps collaboration patterns, keyword trajectories, and co-citation structures to clarify the intellectual base and emerging frontiers of gamified language instruction. Although bibliometric approaches have begun to appear in this domain, existing analyses remain limited in scope or dataset coverage, underscoring the need for a more comprehensive mapping of the field (Shang, 2025). By providing a descriptive account of what concepts, methods, and application domains dominate the corpus, the results will establish an empirical foundation for the discussion, where the observed patterns will be synthesized into a coherent interpretive account of how gamified language teaching can be theorized, operationalized, and evaluated across contexts.
3. Results
3.1. Subject Distribution Characteristics
Drawing on subject classification data extracted from the Web of Science database, this study analyzed gamified language teaching research spanning the last three decades. The results reveal the pronounced interdisciplinary nature of the field. Computer-science–related fields predominate, with “Computer Science, Interdisciplinary Applications” (85 citations), “Computer Science, Artificial Intelligence” (67 citations), and “Computer Science, Information Systems” (61 citations) ranking as the top three. This distribution pattern indicates the field's strong dependence on computer technology (Deterding et al., 2011; Gee, 2003).
Although there is a general increase in the number of connections (246 nodes and 138 edges, density around 0.0046), the network is still very sparse, indicating low integration and persistent fragmentation. This reflects the early maturity of the gamified language teaching research community, in which collaborations tend to remain local to small communities. As noted by Glänzel and Schubert (2005), fragmentation is a common phenomenon in emerging fields, where thematic diversity and methodological innovation tend to slow the formation of stable research communities.
Figure 1.
Subject Classification.
It is noteworthy that certain academic disciplines demonstrate high centrality metrics despite limited publication output. For instance, “Neuroscience” (centrality 0.68) and “Experimental Psychology” (centrality 0.55) act as pivotal connectors between cognitive science, educational technology, and language learning, promoting profound interdisciplinary integration (Mayer, 2014; Plass et al., 2015).
The annual publication volume of research on gamified language teaching has demonstrated sustained growth. Research activity rose significantly in 2005, followed by continuous expansion thereafter. After 2015, a notable surge occurred, especially with the proliferation of mobile learning technologies and adaptive platforms (Koivisto & Hamari, 2019). Following the COVID-19 pandemic, online and hybrid modes further accelerated this trajectory. Mobile language learning apps gained popularity among college students preparing for proficiency exams, where gamification improved both engagement and satisfaction (Dizon, 2016).
Empirical findings show that gamified features such as reward systems and progress visualization effectively sustain motivation and improve retention in self-regulated learning environments (Landers, 2014; Surendeleg et al., 2014). Hence, balancing instructional goals and user experience when embedding game elements has become a central concern in the design of educational technology.
Figure 2.
Publications and Citations.
3.2. Collaborative Pattern
3.2.1. Author Collaboration Network
Figure 3 illustrates the author collaboration network in gamified language teaching research from 2013 to 2025, generated using CiteSpace. The network comprises 246 authors and 138 co-authorship links, yielding a low density of approximately 0.0046. Such sparsity is characteristic of emerging or interdisciplinary research fields, where stable and large-scale collaborative communities have not yet fully formed (Glänzel & Schubert, 2005; Newman, 2001). To improve interpretability, minimum spanning tree (MST) pruning was applied. This conservative visualization technique removes redundant ties while preserving structurally significant connections, thereby emphasizing the backbone of collaboration rather than incidental co-authorships (C. Chen, 2006). The resulting sparsity should therefore be interpreted as a structural feature rather than an absence of scholarly interaction.
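The density figure reported here follows directly from the node and edge counts via the standard formula for undirected networks. As a minimal sketch (the function name is ours; the counts are those reported in the text):

```python
def network_density(n_nodes: int, n_edges: int) -> float:
    """Density of an undirected network: observed edges / possible edges."""
    possible = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible

# Author collaboration network reported above: 246 authors, 138 links.
print(round(network_density(246, 138), 4))  # 0.0046
```

The near-zero value reflects that only a small fraction of the roughly 30,000 possible author pairs are realized as co-authorship ties, which is the sparsity the MST-pruned visualization foregrounds.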
Within this network, some authors, such as Guo, emerge as prominent hubs (Guo et al., 2022), indicating comparatively higher levels of collaborative engagement. This pattern is consistent with bibliometric analyses of technology-enhanced and game-based language learning research, which identify these authors among the most productive and highly connected in the field (Zhai et al., 2022).
The concentration of ties around a limited core is consistent with cumulative advantage mechanisms, whereby established researchers attract disproportionate numbers of new collaborations, reinforcing their central positions over time (Newman, 2001). Consequently, network expansion does not necessarily produce greater integration but may instead consolidate influence among core actors. Despite the presence of central authors, the network remains fragmented, with relatively few links connecting different clusters. This fragmentation is typical of developing research domains characterized by thematic diversity and methodological experimentation (Glänzel & Schubert, 2005).
From a network-theoretical perspective, bridging positions are analytically significant, as actors occupying such roles can facilitate information diffusion across otherwise disconnected groups (Burt, 2004; Freeman, 1978). The limited number of bridging links observed suggests that cross-group collaboration remains underdeveloped. Overall, the network reflects a field that is expanding in scale but still consolidating structurally. Strengthening cross-institutional and interdisciplinary collaboration, particularly by fostering bridging roles, may be essential for enhancing long-term coherence and cumulative knowledge development in gamified language teaching research.
Figure 3.
Author Collaboration Network.
3.2.2. Institutional Collaboration Network
At the institutional level, the collaboration network exhibits multi-centered clustering, where several institutions—most notably universities from China, Malaysia, Spain, and the United States—act as regional hubs. This pattern suggests that gamified language teaching has moved beyond isolated pilot projects toward structured institutional cooperation. The visualization from CiteSpace reveals heterogeneous institutional participation, with some universities (e.g., Universiti Kebangsaan Malaysia and The Education University of Hong Kong) forming dense intra-regional clusters, while others engage in transnational collaborations.
Institutional network density remains modest compared to national networks but shows a steady upward trajectory, mirroring trends observed in the broader educational technology domain (Adams, 2013). The emergence of cross-institutional collaboration consortia may be attributed to the global expansion of open-access journals, international conferences, and digital collaboration platforms. These factors lower the entry barriers for joint publications and facilitate resource sharing.
Nevertheless, institutional collaborations in this field often exhibit short-term and project-based characteristics, lacking sustained partnerships that can yield longitudinal datasets or cross-cultural comparative insights. According to Wuchty, Jones, and Uzzi (2007), stable and interdisciplinary institutional teams are crucial for driving innovation and producing high-impact outputs in science (Wuchty et al., 2007). Therefore, the gamified language teaching field should prioritize the establishment of long-term, interdisciplinary institutional alliances to strengthen empirical depth and global comparability.
Figure 4.
Institutional Collaboration Network.
3.2.3. National or Regional Collaboration Network
The national and regional collaboration network reveals a more integrated and mature structure than that observed at the author or institutional levels. Major contributors include China, the United States, Spain, Malaysia, and Brazil, forming a globally distributed network with cross-continental linkages. The relatively high network density (~0.0225) suggests that national-level cooperation in gamified language teaching has reached a substantial scale.
The prominence of Asian countries, especially China and Malaysia, underscores the Asia-Pacific rise in digital education research, a shift from the traditional “Western-centric” dominance in earlier educational innovation studies. Yazdi et al. (2024) noted that the global diffusion of gamification research is increasingly characterized by bidirectional knowledge flow between Western and Asian institutions. This transformation corresponds to what Marginson described as the “multipolar globalization” of higher education, where regional hubs increasingly assume leadership roles in international academic discourse (Marginson, 2019).
Cross-national cooperation also facilitates the transfer of methodological frameworks and gamification models across linguistic and cultural boundaries, promoting comparative analyses and global pedagogical inclusivity. Such initiatives align with Luo (2023), who emphasized the effectiveness of integrating gamified learning tools across varied educational contexts, thereby enhancing both engagement and equity in foreign language education.
Figure 5.
National or Regional Collaboration Network.
3.2.4. Overall Interpretation and Future Directions
Synthesizing the findings from author, institutional, and national networks, it becomes evident that the research community in gamified language teaching is undergoing a transition from fragmentation to integration. The author network remains characterized by a few dominant hubs, while institutional collaborations are gradually diversifying, and national networks demonstrate increasing globalization and inclusivity.
This hierarchical expansion reflects a “bottom-up globalization model”, where individual scholars and local institutions pioneer collaboration, gradually scaling up toward international research frameworks. The bibliometric patterns confirm that the field is evolving toward a more interconnected ecosystem that integrates technology, pedagogy, and cross-cultural perspectives.
Ultimately, the collaborative evolution in gamified language teaching signifies a paradigm shift toward internationalized educational innovation, supported by digital technology and transdisciplinary engagement. Such integration aligns with the broader trajectory of educational research toward networked science, as characterized by large-scale, multi-institutional, and data-driven collaboration models (Sergeeva et al., 2024). The future success of this field thus hinges on cultivating not only individual scholarly excellence but also collective capacity-building across geographic and institutional boundaries.
3.3. Co-occurrence Analysis
3.3.1. Keyword Co-occurrence Pattern
Keyword co-occurrence analysis offers a compact, results-focused view of how gamified language education is being framed across the corpus. The network foregrounds a design-oriented vocabulary that travels with learning outcomes and learner experience. High-frequency terms cluster around game elements, motivation, engagement, feedback, assessment, and language skills, suggesting that studies commonly treat gamification as a configurable set of instructional levers rather than a single intervention label (Hung et al., 2018; Zou et al., 2021). In this sense, the co-occurrence map helps specify what the field tends to operationalize when it reports “gamified” learning environments.
First, pedagogical targets are tightly coupled with specific language outcomes. Vocabulary learning, grammar, speaking, and writing appear as recurring anchors, often adjacent to words indexing performance, achievement, and retention. This pattern aligns with classroom and online implementations where game points, levels, and feedback loops are embedded into task sequences and then evaluated via tests or analytic rubrics (Z.-H. Chen et al., 2018; Reinders & Wattana, 2015). The co-occurrence structure therefore indicates that outcome measurement is typically linked to discrete skill targets, with vocabulary and communicative performance appearing as common evaluation foci.
Second, a social and interactional strand is visible, though it is less central than the design and outcome cluster. Keywords related to collaborative learning, peer interaction, and social engagement appear alongside motivation and enjoyment, indicating that many studies consider social structures as part of the gamified setting rather than as an independent object of analysis (Acquah & Katz, 2020; Dehghanzadeh et al., 2021). This strand suggests that the literature often mixes individual progression mechanics with group-based activities, but the network also implies that social features may be underspecified compared with points or badges.
Third, methodological terms are positioned near the thematic center, highlighting the routine reliance on quasi-experimental comparisons, surveys, and mixed measures to document engagement and learning effects. Assessment, questionnaire, and experiment co-occur with motivation related keywords, which suggests that behavioral and self-report indicators are frequently combined to support claims about effectiveness (Shortt et al., 2023; Thompson & von Gillern, 2020). Overall, the co-occurrence pattern portrays a research space organized around the intersection of design features, learner experience, and measured language outcomes, which provides a descriptive baseline for the interpretive synthesis developed in the discussion section.
Across the network, several contextual markers co-occur with the dominant design and outcome terms, including classroom, online learning, mobile applications, and higher education. These co-occurrences suggest that many interventions are implemented as short modules within existing courses or platforms, where instructors can control task sequencing but cannot fully standardize exposure outside class. As a result, the keyword map implicitly highlights boundary conditions that later matter for interpretation, such as whether learning is individual or group based, whether activities are competitive or cooperative, and whether the target is memorization-oriented practice or communicative performance. Although these boundaries are not always made explicit in reporting, their repeated proximity to motivation and performance terms indicates that contextual variation is a persistent feature of the empirical landscape (Hung et al., 2018; Zou et al., 2021).
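The co-occurrence counts underlying such a map rest on a simple computation: for each article, every unordered pair of its keywords is counted once, and pair counts are accumulated across the corpus. A minimal sketch, with keyword lists invented for illustration rather than drawn from the corpus:

```python
from collections import Counter
from itertools import combinations

# Illustrative per-article keyword lists (not actual corpus records).
articles = [
    ["gamification", "motivation", "vocabulary"],
    ["gamification", "engagement", "mobile learning"],
    ["motivation", "vocabulary", "feedback"],
]

cooccurrence = Counter()
for keywords in articles:
    # Count each unordered keyword pair once per article.
    for pair in combinations(sorted(set(keywords)), 2):
        cooccurrence[pair] += 1

print(cooccurrence[("motivation", "vocabulary")])  # 2: the pair co-occurs in two articles
```

Node frequency (how often a single term appears) and edge weight (how often a pair co-occurs) are then the raw inputs to the layout and centrality metrics that tools such as CiteSpace compute.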
3.3.2. Keyword Emergence Analysis
Keyword emergence analysis complements co-occurrence by identifying terms that show abrupt growth over bounded time windows. The strongest bursts mark shifts in what researchers newly foreground, especially where technology, analytics, and specific learning targets become more salient. Across the recent period, emergent terms indicate that gamified language learning is increasingly discussed in relation to adaptive support, immersive environments, and more fine-grained accounts of learner behavior (Dixon, 2022).
One emerging direction concerns intelligent and adaptive support for learning processes. Bursting terms tied to natural language processing, learning analytics, and personalization suggest that the field is moving beyond static rule-based gamification toward systems that diagnose performance and adjust feedback or task difficulty. This direction also implies a growing interest in data traces as behavioral evidence, for example platform logs that capture persistence, pacing, and response patterns in game-like tasks (Dixon, 2022).
A second direction centers on vocabulary acquisition as a recurring frontier. The appearance of vocabulary-related bursts, together with terms connected to digital flashcards, mobile learning, and contextualized practice, indicates renewed attention to scalable lexical learning designs. The literature also shows a tendency to pair gamified vocabulary activities with motivational measures, suggesting that vocabulary has become a convenient test bed for examining short-term engagement and longer-term retention (Honarzad & Soyoof, 2020; Yudintseva, 2015).
A third direction highlights serious games, learner-centered perspectives, and immersive modalities. Bursting terms related to serious games point to a broader alignment with game-based learning beyond superficial reward structures, while recent interest in virtual reality and augmented reality indicates a turn toward richer interaction channels and embodied task settings (Zhang & Hasim, 2023). Studies in this frontier often position motivation and experience as key outcomes alongside language performance, but they vary widely in age groups, task types, and social arrangements, which signals the need for clearer boundary descriptions when interpreting effects (Li & Phongsatha, 2025; Sadigzade, 2025).
Taken together, the co-occurrence and emergence results summarize what themes dominate and what topics are gaining momentum without forcing a mechanism-based interpretation inside the results section. This descriptive map sets up the discussion by clarifying which design elements, learner variables, and outcome indicators recur most often, and where newer technology-enabled strands are beginning to reshape the research agenda.
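Burst detection in CiteSpace follows Kleinberg's state-machine model; as an intuition-level proxy only, a term can be flagged as emergent when its yearly frequency rises well above its long-run average. The counts, term, and threshold below are illustrative assumptions, not corpus values:

```python
def bursts(yearly_counts: dict[int, int], ratio: float = 2.0) -> list[int]:
    """Flag years whose count is at least `ratio` times the long-run mean.

    This is a simplified stand-in for Kleinberg-style burst detection,
    intended only to convey what an 'emergence' signal responds to.
    """
    years = sorted(yearly_counts)
    baseline = sum(yearly_counts.values()) / len(years)
    return [y for y in years if yearly_counts[y] >= ratio * baseline]

# Invented counts for a hypothetical term such as "virtual reality".
counts = {2018: 1, 2019: 1, 2020: 2, 2021: 10, 2022: 10}
print(bursts(counts))  # [2021, 2022]
```

The flagged window corresponds to the bounded burst interval reported for each emergent keyword: a period in which the term's salience departs sharply from its own baseline rather than merely being frequent overall.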
3.4. Co-Citation Analysis
This section reports the co-citation clustering and timeline patterns that structure the intellectual base of gamified language instruction. The following cluster summaries provide a result-oriented map of what the community most frequently co-cites and how the knowledge base has evolved (Alfadil, 2020; Belda-Medina & Calvo-Ferrer, 2022).
#0 digital game-based learning: This is the largest and most cohesive knowledge core. Co-cited works in this cluster anchor gamified language instruction in broader digital game-based learning and CALL traditions, emphasizing that game-like activities can support vocabulary acquisition and skill development when embedded in pedagogically meaningful tasks. The cluster also highlights the gradual shift from proof-of-concept studies toward classroom-grounded implementations and more diversified technologies, including mobile and immersive variants that extend digital play into authentic learning settings (Alawadhi & Abu-Ayyash, 2021; Hwang et al., 2016). Overall, Cluster #0 functions as the foundational reference point to which later clusters connect.
#1 ESL instruction: This cluster concentrates on ESL and EFL instructional applications and the learner experience within formal courses. Cited and citing works emphasize motivational explanation, classroom integration, and the role of task design in sustaining engagement. For example, recent contributions in this cluster draw on self-determination theory to examine how specific game elements may support autonomy, competence, and relatedness in language tasks (Luo, 2023). The cluster also includes empirical work that links gamified activities with engagement and performance outcomes in varied instructional settings, illustrating how effectiveness claims are typically situated within particular learner populations and course constraints (Laksanasut, 2025; Luo, 2023).
#2 research trend: The research trend cluster reflects a meta-level knowledge base that documents how the field describes its own development. It contains bibliometric analyses, mapping studies, and review-oriented syntheses that consolidate dominant themes and identify emerging topics. These works provide evidence that gamified language learning has moved from general claims about motivation toward more differentiated questions about design, measurement, and contextual moderators, while also showing rapid growth in mobile and technology-enabled learning environments (H.-J. H. Chen & Hsu, 2020; L. Yang et al., 2024). Within this cluster, cited empirical studies are frequently used to exemplify the reported trajectories, including work on VR-supported language learning and teacher education contexts (Alfadil, 2020; Belda-Medina & Calvo-Ferrer, 2022).
#3 mobile game-based learning: This cluster captures the rise of mobile platforms as a distinct line of work. Co-cited studies examine how smartphones, apps, and ubiquitous access reshape gamified practice, often through short, repeated activities that can be aligned with vocabulary, pronunciation, or grammar targets. The cluster reflects a pattern in which mobile delivery is treated as a way to extend exposure and feedback beyond class time, while also introducing new data sources such as interaction logs. The prominence of this cluster indicates that mobile-game-based language learning has become a durable branch of the knowledge base rather than a short-lived trend (Z.-H. Chen et al., 2018; Hwang et al., 2016).
#4 game design element: This cluster focuses on the design components that are treated as causal levers in gamified instruction. Cited works in this cluster compare or decompose elements such as points, badges, levels, narrative, and feedback, and they discuss how combinations of elements may shape motivation and learning outcomes. Review-oriented contributions highlight recurring element sets and recommend moving from surface rewards to design rationales that align mechanics with learning tasks (Govender & Arnedo-Moreno, 2021; Su et al., 2021). Empirical studies also underline that design element effects are not uniform and may vary with learner characteristics, task types, and classroom constraints (Al-Dosakee & Ozdamli, 2021; Mee et al., 2021).
#5 motivation students need: This cluster consolidates foundational motivational accounts that are repeatedly used to interpret gamified language learning. Frequently co-cited sources emphasize learner needs, goal-directed participation, and the role of feedback and challenge, providing conceptual grounding for later SDT-based and engagement-based studies. The presence of early CALL and game-mediated language learning work in this cluster shows continuity between gamification and prior digital game traditions, including attention to learner agency and interactional contexts (Zhang et al., 2017). More recent work within the broader network draws on these foundations to explain why similar designs can yield different engagement trajectories across contexts (Anderson & Yantis, 2013; Zhang et al., 2017).
#6 learning performance: This cluster foregrounds how learning outcomes are operationalized and how performance is evaluated in gamified language instruction. Co-cited studies link design features to measurable gains, often combining achievement tests with engagement indicators to justify effectiveness claims. The cluster also signals that performance evaluation is increasingly tied to specific tasks and proficiency levels, rather than global achievement outcomes, which helps explain why results vary across studies. Representative contributions connect gamification with learning performance in language classrooms and online settings while highlighting the importance of outcome alignment and measurement quality (Hung et al., 2018; Lee, 2019; Wen, 2018; J. C. Yang et al., 2018).
#8 cognitive scaffolding model: This cluster highlights the growing role of cognitive support in gamified instruction, including scaffolding through task progression, feedback, and contextual cues. Co-cited works connect game-like environments with structured guidance that can reduce cognitive burden and sustain productive practice, especially in simulation and immersive tasks (Berns et al., 2016; Cornillie et al., 2012). Reviews and meta-theoretic accounts in this cluster emphasize design principles for serious games and educational games, and they position scaffolding as a bridge between motivational appeal and learning effectiveness (Ke, 2016; Krath et al., 2021). Recent work also connects scaffolding to measurable learner behavior and engagement in technology-mediated language learning (Bahari, 2022; Gamlo, 2019; Peterson, 2023).
Taken together, the co-citation clusters and their timeline signals describe a knowledge base that is simultaneously expanding and differentiating. The evidence indicates a stable foundation in digital game-based learning, growing specialization in mobile delivery and design element research, and increased attention to motivation, scaffolding, and performance measurement. These results provide the empirical basis for the next chapter, which synthesizes the descriptive patterns into a coherent interpretive discussion of how gamified language instruction has been theorized, operationalized, and evaluated across contexts.
4. Discussion
4.1. Summary of Key Findings and Trends
Based on the results, the bibliometric evidence portrays gamified language instruction as technology-enabled, theoretically diverse, and still consolidating. In Section 3.1, the analysis of subject distribution characteristics shows that the field is anchored in applied computer science, artificial intelligence, and information systems, which provide the platforms and data pipelines that make gameful interventions measurable, including learning analytics for tracing behavioral patterns and learning processes over time (Pardo & Siemens, 2014; Siemens, 2013). Education, psychology, and linguistics supply the explanatory language for interpreting engagement and learning, with motivation the most prominent construct. Two trajectories increasingly intersect: instrumentation for modeling and prediction (Pardo & Siemens, 2014; Siemens, 2013) and theory-driven evaluation of motivational and learning effects grounded in game motivation research (Deci & Ryan, 2000; Ryan et al., 2006). Overall, growth reflects a shift toward specifying how gameful structures organize cognition, affect, and practice in ways that matter for language development (Plass et al., 2015).
In Section 3.2, the collaboration pattern analysis adds a complementary view at the community level. The collaboration networks show a stable core-periphery pattern, with a small set of hub authors and institutions and many short-lived, locally organized clusters. Collaboration is expanding across China and Malaysia with international links, suggesting globalization without fully stabilized research communities. National networks are denser, reflecting the rise of the Asia-Pacific region in digital education and gamification and the broader shift toward multipolar knowledge production (Zhou & Leydesdorff, 2006). At the same time, many ties remain project-based, limiting longitudinal tracking, cross-cultural comparison, and shared methodological standards. The network evidence therefore points to the need for stronger coordination, such as shared datasets, aligned measures, and replication-oriented designs, to turn expansion into cumulative knowledge.
In Section 3.3, the keyword co-occurrence analysis clarifies what the community is converging on by mapping the thematic core and its internal hierarchy. The keyword network centers on language learning, game-based learning, motivation, and performance, indicating a dominant agenda that links design decisions to motivational processes and measurable outcomes. Within this core, one pole emphasizes definition and design. Gamification is repeatedly framed as introducing game design elements into learning contexts, distinguishing gamified instruction from full games and other digital tools (Deterding et al., 2011). This pole also treats gamification as a systematic approach for structuring participation, feedback, and progression (Kapp, 2012). As the field matures, attention shifts from listing elements to testing which combinations in which contexts yield reliable motivational and behavioral effects, echoing calls for better controlled studies and tighter links between specific elements and mechanisms (Dicheva et al., 2015). In short, design is treated as a causal hypothesis about how mechanics shape learning behavior.
The second thematic pole in 3.3 concerns motivational explanation and its empirical variability. The prominence of self-determination theory suggests that competence, autonomy, and relatedness are viewed as pathways through which gameful features support engagement and persistence (Deci & Ryan, 2000). Syntheses also caution against assuming uniform benefits, showing that positive average relations often coexist with mixed results depending on implementation, element selection, and learner population (Hamari et al., 2016; Koivisto & Hamari, 2019). Meta-analytic evidence similarly reports small positive effects on cognitive, motivational, and behavioral outcomes while emphasizing that effect stability depends on study rigor and contextual moderators such as social interaction and narrative (Sailer & Homner, 2020). These patterns imply that motivation is a useful bridge only when paired with explicit design specifications and contextual constraints, which helps explain the continued use of SDT for generating testable hypotheses about need satisfaction, engagement quality, and persistence (Deci & Ryan, 2000; Ryan et al., 2006).
A further trend from Section 3.3 is the increasing granularity of outcome focus and the diversification of research frontiers. Timeline patterns suggest movement from early feasibility work toward classroom effectiveness studies, followed by newer directions that integrate immersive technologies such as virtual reality and augmented reality. Vocabulary and performance terms remain salient, indicating a shift from broad attitudinal claims toward skill-specific assessment and mechanism-oriented measurement. Alfadil’s quasi-experimental study illustrates this turn by showing stronger vocabulary gains in a virtual reality game condition and linking in-world interaction behaviors to vocabulary performance (Alfadil, 2020). The thematic evolution also highlights growing interest beyond formal classrooms, including mobile and online participation settings for L2 learning (Godwin-Jones, 2014; Reinders & Wattana, 2015) and out-of-school gaming and authentic multiplayer contexts that widen motivational and sociocultural considerations (Hung et al., 2018).
In Section 3.4, the co-citation analysis consolidates these signals by revealing a knowledge core with several application branches. The largest cluster, digital game-based learning, integrates research on outcomes, mechanisms, and instructional design, linking early serious game work to later second language teaching practices (Connolly et al., 2012; Wouters et al., 2013). Clusters on ESL instruction and mobile-assisted learning show that application contexts now organize knowledge development. A cluster focused on game design elements connects feedback, rewards, narrative, and social interaction to motivational accounts, particularly SDT, as mechanisms supporting autonomy, competence, and relatedness (Deci & Ryan, 2000). Review and commentary nodes remain influential, indicating that synthesis helps define subfields, standardize terminology, and identify gaps in design evidence and measurement. Overall, the co-citation structure reflects a stable motivational core, multiple contextual branches, and an expanding capacity for process tracking.
Taken together, the three analytic strands converge on an emerging integrative logic for the field. Technological infrastructures and analytics expand what can be observed and optimized (Pardo & Siemens, 2014; Siemens, 2013), collaboration patterns show how knowledge is produced and where alignment remains limited (Zhou & Leydesdorff, 2006), and thematic and co-citation structures position motivation and design as the bridge between gameful experiences and language outcomes (Deci & Ryan, 2000; Hung et al., 2018). These trends suggest that future theorizing will be most productive when it links the learning environment, the psychological mechanisms that translate design into sustained engagement, and language outcomes that can be assessed at both general and skill-specific levels. Thus, the bibliometric evidence supports a structured account of how designable features connect to mechanisms and assessment, rather than treating gamification as a unitary treatment effect.
4.2. Towards a Theoretical Framework
Figure 5.
Integrated Technology, Psychology, and Pedagogical Framework for Gamified Language Learning.
The framework in Section 4.2 is derived from the bibliometric synthesis rather than proposed as a detached theory. We interpret the co-citation structure, keyword bursts, and thematic clusters as repeated signals about what the field designs, measures, and explains. These signals are consolidated into three modules that connect technology-mediated learning conditions to psychological and behavioral processes and then to outcomes. In addition, the framework is compatible with learning analytics perspectives that emphasize trace data and ethically grounded measurement (Siemens, 2013; Pardo & Siemens, 2014). It also responds to calls for theory-driven gamification research that distinguishes elements, mechanisms, and outcomes (Dicheva et al., 2015) and to foundational definitions and theory-building work in gamification (Deterding et al., 2011; Hamari et al., 2016; Landers, 2014).
Table 1.
Signals from Bibliometric Mapping that Inform the Three-Module Framework.
| Analysis | Implications for the technology-mediated learning environment module | Implications for the psychological mechanisms module | Implications for the learning outcomes module |
| --- | --- | --- | --- |
| Collaboration analysis | Highlights cross-platform and cross-disciplinary experimentation, suggesting the environment should be specified in terms of platform, modality, and data capture. | Reveals heterogeneous construct use across teams, motivating mechanism definitions that travel across contexts. | Suggests a need for shared measurement conventions so that outcomes are comparable and cumulative. |
| Keyword co-occurrence and bursts | Shows rising attention to mobile, VR, AR, and analytics, reinforcing the environment as a design and instrumentation layer. | Signals the centrality of motivation and engagement, but also points to parallel mechanisms such as social comparison, achievement goals, self-efficacy beliefs, reinforcement from feedback, and affect regulation, supporting a mechanism-comparison approach rather than a single SDT pathway. | Emphasizes performance and skill-specific outcomes such as vocabulary and proficiency, encouraging aligned assessments. |
| Reference co-citation and timeline analysis | Connects design work in game-based learning and digital platforms, supporting environment features such as feedback, progression, and interaction logs. | Clusters around motivational and instructional theories, indicating a shift from element lists to explanations that can compare SDT with adjacent behavioral mechanisms and specify moderators. | Aggregates evidence from reviews and meta-analyses, motivating outcome definitions and the move toward rigorous evaluation. |
Module 1, the technology-mediated learning environment, specifies what is implemented and what learners actually experience. It covers the choice and configuration of game elements, the language tasks they wrap, and the platform features that shape feedback timing, social visibility, and interaction constraints. In this module, gamification is treated as a set of designable features whose meaning depends on implementation details, not as a unitary treatment (Deterding et al., 2011). To support mechanism testing, Module 1 also clarifies what data can be captured in platform logs and classroom instrumentation, and what analytic practices are appropriate for educational contexts (Kapp, 2012; Siemens, 2013). Ethical and privacy considerations are embedded rather than appended, since data richness does not substitute for valid and responsible measurement (Pardo & Siemens, 2014).
Module 2, psychological mechanisms, explains why similar designs can yield divergent trajectories across learners and contexts. SDT provides a baseline account of autonomy, competence, and relatedness support in gamified learning (Deci & Ryan, 2000; Sailer & Homner, 2020). Our synthesis also highlights mechanisms that are frequently implied but not always explicitly modeled, including social comparison elicited by public rankings (Festinger, 1954), performance and mastery goal regulation (Elliot & McGregor, 2001), shifts in self-efficacy expectations (Bandura, 1997), and affective routes such as anxiety and achievement emotions that can facilitate or suppress participation (Horwitz et al., 1986; Pekrun, 2006). Treating these as competing pathways encourages clearer hypothesis specification and more discriminating measurement.
Module 3, learning outcomes, clarifies what counts as evidence and how outcomes should be interpreted relative to mechanisms. Reviews of gamification and digital game-based learning report mixed effects on achievement, especially when engagement is assumed rather than measured or when extraneous rewards crowd out task focus (Connolly et al., 2012; Dicheva et al., 2015; Wouters et al., 2013). For language learning, outcomes can include performance indicators such as vocabulary gain and speaking fluency, but also behavioral indicators such as time-on-task, persistence, and interaction frequency that provide closer tests of mechanism activation (Reinders & Wattana, 2015). Recent syntheses in language education similarly suggest that technology supported interventions are most persuasive when they report both learning and process measures (Alfadil, 2020). At the design level, the module emphasizes that feedback must be informative and aligned with learning goals (Erhel & Jamet, 2013).
Taken together, the three modules form an explanatory chain from design conditions to mechanisms to pedagogical outcomes, which directly addresses the heterogeneity observed in both bibliometric patterns and empirical reviews. Module 1 defines the manipulable design space, Module 2 specifies causal processes, and Module 3 provides pedagogical outcome targets and measurement strategies. This chain encourages researchers to test whether a given element changes the intended mechanism before interpreting null or mixed learning effects, a logic that is compatible with learning analytics and theory-driven accounts of gamified learning (Landers, 2014; Siemens, 2013). It also supports comparative theorizing: the same element can operate through different mechanism pathways depending on context, for example autonomy support in one setting and competitive social comparison in another (Hamari et al., 2014; Sailer & Homner, 2020). Finally, the framework underscores that robust claims require triangulation across design traces, mechanism measures, and outcome assessments, rather than relying on single endpoint tests (Dicheva et al., 2015). In line with prior evidence, it treats well-aligned feedback as a core design constraint for sustaining learning rather than substituting for it (Erhel & Jamet, 2013).
4.3. Insights for Behavioral Mechanisms
Table 2 operationalizes the three-module model proposed in Section 4.2 by translating gamification design conditions (Module 1) into competing psychological mechanisms (Module 2) and their expected behavioral signatures and learning outcomes (Module 3), while specifying the boundary conditions under which each pathway is most likely to hold. This mechanism-centered synthesis reorients the discussion from cataloging design elements to explaining why, when, and for whom they change observable learning behaviors. For instance, a leaderboard may evoke social comparison and a performance goal orientation, leading to higher practice frequency or persistence for some learners, yet it can also increase threat appraisal and reduce participation among novices or anxious learners, illustrating how the same design can operate through different mechanism pathways depending on context.
First, design decisions should be guided by mechanism alignment rather than element accumulation. The same feature can recruit different processes across contexts, and mismatched processes often explain null or mixed effects in the empirical record. For instance, points may function as informational feedback that supports competence, or as controlling rewards that redirect attention to external payoff (Landers, 2014; Ryan & Deci, 2017). Before selecting features, teachers can specify a target behavior such as sustained voluntary practice, willingness to communicate, or collaborative help seeking, and then choose conditions that make the intended mechanism dominant. This implies attending to boundary conditions including learner age, proficiency, task structure, social grouping, and whether the design emphasizes competition or cooperation.
Second, self-determination theory is most useful as a baseline account of need support, but it should be embedded in a mechanism comparison framework. In gamified language learning, features commonly operate through social comparison (Festinger, 1954), achievement goal framing (Elliot & McGregor, 2001), self-efficacy beliefs (Bandura, 1997), expectancy–value appraisal (Eccles & Wigfield, 2002), reinforcement learning from feedback (Skinner, 1965; Sutton & Barto, 2018), habit formation through repetition and cue consistency (Lally et al., 2010), and emotion dynamics such as control–value appraisals (Pekrun, 2006). A mechanism map prevents over-attributing effects to need satisfaction and helps interpret divergent outcomes, for example when leaderboards raise effort for confident learners via comparison and performance goals, yet reduce participation for novices or anxious learners via threat appraisal.
Third, mechanism claims should be evaluated with behavioral signatures. If a design is intended to support autonomy and value, it should increase self-initiated engagement, optional practice, and deeper strategy use. If it is intended to leverage reinforcement, it should increase immediate attempts, repetition rates, and responsiveness to feedback. If it leverages habit formation, it should increase consistency across days and reduce dropout over time. For speaking tasks, reductions in foreign language anxiety should be reflected in longer speech duration, more turns taken, and higher willingness to communicate (Bandura, 1997; Horwitz et al., 1986). Accordingly, classroom evaluations can combine learning outcomes with process data such as time-on-task, spacing and frequency of practice, interaction logs, and behavioral indicators of persistence, alongside brief mechanism-proximal measures.
Finally, the framework highlights pedagogical trade-offs that depend on social and emotional boundary conditions. Competitive structures can be productive when learners have sufficient skill, criteria are transparent, and failure is low cost, but they can intensify anxiety and discourage lower-ranked learners. Cooperative missions and peer feedback may better sustain relatedness and persistence in mixed-ability groups, especially in communicative tasks where social risk is salient. The practical implication is not to avoid gamification, but to align design conditions with the most plausible mechanism, articulate the expected behavioral signature, and monitor whether the classroom pattern matches the hypothesized process. This approach connects directly to the integrated framework and supports cumulative, mechanism-based refinement of gamified language instruction.
Table 2.
Mechanism Comparison Map for Gamified Language Learning.
| Design condition | Primary mechanism(s) | Expected behavioral signature | Boundary conditions | Suggested measures |
| --- | --- | --- | --- | --- |
| Points with informative feedback | Competence support; reinforcement (Ryan & Deci, 2017; Skinner, 1953) | More attempts and voluntary repetitions; faster adjustment after feedback | If framed as controlling reward, may reduce autonomy and exploration | Attempts, repeats, response time, error correction rate |
| Badges and mastery levels | Achievement goals; competence (Elliot & McGregor, 2001) | Persistence on challenging items; selection of higher difficulty | Performance framing can trigger avoidance for lower performers | Retries, optional challenge uptake, task choice patterns |
| Leaderboards or rank | Social comparison; performance goals (Festinger, 1954) | Increased effort for some; withdrawal for others | Risk for novices and anxious learners; consider team ranking | Participation rate, dropout, effort proxies, brief affect check |
| Quests, narrative, meaningful choices | Autonomy support; task value (Ryan & Deci, 2017; Eccles & Wigfield, 2002) | Self-initiated engagement; deeper exploration and elaboration | Story must align with language objectives and time constraints | Time on optional content, strategy indicators, value ratings |
| Cooperative missions and peer feedback | Relatedness; social interdependence (Ryan & Deci, 2017) | More interaction turns, help seeking, and peer scaffolding | Group composition and norms matter; manage free riding | Turn taking, messages, peer feedback counts, collaboration quality |
| Streaks and daily challenges | Habit formation; reinforcement (Lally et al., 2010) | Higher practice regularity and lower attrition across weeks | Can induce pressure; allow recovery options and flexible goals | Days active, spacing, streak length, return rate |
| Low-threat speaking simulation | Self-efficacy; anxiety regulation (Bandura, 1997; Horwitz et al., 1986; Pekrun, 2006) | Longer speech duration; more turns; higher willingness to communicate | Avoid public ranking; calibrate difficulty and evaluative cues | Speech duration, turns taken, WTC ratings, anxiety short scale |
5. Conclusion
This review integrates bibliometric mapping with a mechanism-oriented synthesis to clarify how gamified language learning has evolved and what drives its effects. Across the corpus, research has shifted from descriptive reports of engagement to theory-informed explanations that link specific design conditions to observable learning behaviors and outcomes. The co-citation structure and keyword trends suggest that self-determination theory remains a dominant lens, yet the evidence also supports multiple pathways through social comparison, achievement goals, expectancy–value beliefs, self-efficacy, reinforcement processes, habit formation, and emotion regulation. By organizing these pathways in Table 2, the paper extends the three-module model by specifying testable links from game elements to mechanisms and to behavioral signatures such as practice frequency, persistence, participation, help seeking, and dropout. Overall, the findings indicate that effectiveness is contingent, with social structure, task demands, and learner characteristics shaping which mechanisms are activated and whether outcomes are beneficial.
Several limitations should be noted. First, the bibliometric analysis relies on database coverage and indexing quality, so relevant studies in regional venues or non-indexed outlets may be underrepresented, and citation-based structures may lag behind emerging work. Second, heterogeneity in study designs, outcome measures, and reporting practices limits direct comparability across experiments, especially when engagement is operationalized through self-report rather than behavioral logs. Third, mechanism inference is often indirect because many primary studies do not measure mediators and moderators in a unified way, which constrains causal interpretation and may inflate the prominence of popular theories. Fourth, publication bias and English language dominance may skew conclusions toward positive and Western samples.
Future research should prioritize mechanism testing through designs that jointly measure targeted mediators, boundary conditions, and process-level behavioral data. Comparative experiments can pit competing mechanisms against each other by manipulating specific elements, such as leaderboards versus cooperative goals, while tracking persistence, strategy use, and interaction patterns. Researchers should also standardize outcome reporting by combining learning performance with behavioral traces and validated affective measures, enabling stronger meta-analytic synthesis. Finally, work in language learning contexts should examine longer time horizons, including habit formation and transfer beyond the intervention, and should attend to equity by analyzing how gamification interacts with proficiency level, anxiety, and access to technology.