Submitted: 23 May 2026
Posted: 25 May 2026
Abstract
Keywords:
1. Introduction
2. Literature Review
2.1. Navigation Technology and Cognitive Offloading: The GPS Precedent
2.2. Metacognition: Components, Development, and Educational Impact
2.3. AI in Education and the Risk of Metacognitive Disuse
2.4. Design Principles for Metacognitive Scaffolding in AI Tools
3. Methodology
3.1. Research Design and Rationale
3.2. Literature Search and Selection
3.3. Data Extraction and Quality Assessment
3.4. Analytical Framework
3.5. Derivation of Design Principles
3.6. Trustworthiness and Rigour
3.7. Methodological Limitations
4. Discussion: From GPS to Compass - A Framework for Metacognitive AI Design
4.1. Synthesis of Cross-Domain Evidence
4.2. Five Evidence-Informed Design Principles
| Principle | Description | Metacognitive Phase Supported | Primary Risk Addressed | Key Supporting Evidence |
|---|---|---|---|---|
| 1. Ask before you act | Prompt the learner to articulate a plan or initial strategy before the AI offers assistance. | Planning | Bypassed strategy selection; passive reception of AI-generated steps. | Azevedo & Gašević (2019); Chi (2000) |
| 2. Delay and fade feedback | Provide graduated hints rather than full solutions; reduce support over time as competence grows. | Monitoring | Suppression of error detection; learned helplessness without the tool. | Butler & Winne (1995); Renkl & Atkinson (2003) |
| 3. Mandatory reflection pauses | Insert structured post-task prompts that ask learners to evaluate their process and identify difficulties. | Evaluation | Skipped consolidation; failure to update strategic knowledge. | Wise & Hsiao (2019); Roelle et al. (2017) |
| 4. Make the process visible | Reveal the AI’s reasoning path, decision points, and alternatives considered; invite comparison with the learner’s own thinking. | Monitoring and evaluation | Fluency illusion; shallow understanding of the solution. | Rozenblit & Keil (2002); VanLehn (2011) |
| 5. Confidence calibration | Ask learners to rate their confidence before revealing answers; track calibration over time and feed it back. | Monitoring (self-assessment accuracy) | Overconfidence; illusion of explanatory depth. | Dunlosky & Rawson (2012); Hacker et al. (2008) |
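As an illustration only, Principles 1 and 2 could be operationalised along the following lines. This is a minimal sketch rather than an implementation drawn from the reviewed studies: the hint ladder `HINT_LEVELS`, the class `HintPolicy`, and the threshold `fade_after` are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical hint ladder, ordered from least to most explicit.
HINT_LEVELS = ("nudge", "strategy", "worked_step", "full_solution")


@dataclass
class HintPolicy:
    """Graduated hints whose ceiling lowers (fades) as competence grows."""

    fade_after: int = 3                    # assumed: unaided successes needed to fade one level
    max_level: int = len(HINT_LEVELS) - 1  # most explicit hint currently allowed
    unaided_streak: int = 0                # consecutive tasks solved without any hint

    def next_hint(self, hints_used: int, plan: str) -> str:
        # Principle 1: withhold assistance until the learner has stated a plan.
        if not plan.strip():
            return "ask_for_plan"
        # Principle 2: escalate one level per request, capped by the faded ceiling.
        level = min(hints_used, self.max_level)
        return HINT_LEVELS[level]

    def record_outcome(self, solved: bool, hints_used: int) -> None:
        # Only unaided successes count towards fading; anything else resets the streak.
        if solved and hints_used == 0:
            self.unaided_streak += 1
            if self.unaided_streak >= self.fade_after and self.max_level > 0:
                self.max_level -= 1
                self.unaided_streak = 0
        else:
            self.unaided_streak = 0


policy = HintPolicy()
first_response = policy.next_hint(hints_used=0, plan="")          # learner has no plan yet
second_response = policy.next_hint(hints_used=0, plan="Draw it")  # plan stated, first hint
```

In this sketch the ceiling only ever lowers. A deployed tool would probably also need a rule for restoring support when a learner struggles repeatedly, which is a design decision the principles themselves leave open.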
4.3. The Compass Metaphor and Its Design Implications
4.4. Implications for Practice, Policy, and Future Research
5. Limitations and Recommendations
5.1. Limitations
5.2. Recommendations for Future Research
- Longitudinal and experimental studies. Researchers should track cohorts of learners who use AI tools with varying degrees of metacognitive scaffolding over extended periods (ideally an academic year or longer). Outcome measures should include not only immediate task performance but also independent problem-solving ability, calibration accuracy, and transfer performance, each measured both with and without AI assistance. Randomised controlled trials comparing AI tools that embed the five design principles against standard AI interfaces would test the compass model directly.
- Neural and cognitive process measures. Cognitive offloading to AI should be studied using neuroimaging (fMRI, fNIRS) and process-tracing methods (eye tracking, log-file analysis) to examine whether prolonged AI use is associated with changes in prefrontal activity patterns during self-regulated learning tasks. Such studies could determine whether cognitive disuse atrophy is occurring at a neural level, analogous to the hippocampal findings in the GPS literature (Dahmani & Bohbot, 2017).
- Developmental and individual differences research. The effects of AI offloading are unlikely to be uniform across all learners. Studies should investigate how prior knowledge, executive function, age, and metacognitive baseline skill moderate the impact of AI assistance. This would inform the design of adaptive AI systems that calibrate the level of cognitive friction to the learner’s current capacity.
- Design-based implementation research. The five principles require testing in authentic classroom settings, with teacher mediation and peer collaboration. Design-based research partnerships between universities, schools, and EdTech developers can refine the principles into practical, context-sensitive implementation protocols and generate professional development resources for educators.
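Calibration accuracy, named above as an outcome measure, can be computed in several ways. The sketch below shows one common formulation, under the assumption that confidence is recorded on a 0 to 1 scale before each answer is revealed. The function name `calibration_summary` and the sample data are hypothetical and are not taken from any of the cited studies.

```python
from statistics import mean


def calibration_summary(judgements):
    """Summarise calibration from (confidence, correct) pairs.

    confidence is assumed to lie in [0, 1]; correct is a bool.
    """
    if not judgements:
        raise ValueError("need at least one confidence judgement")
    confidences = [conf for conf, _ in judgements]
    outcomes = [1.0 if ok else 0.0 for _, ok in judgements]
    # Bias: mean confidence minus proportion correct (positive = overconfident).
    bias = mean(confidences) - mean(outcomes)
    # Absolute error: mean item-level gap between confidence and outcome.
    absolute_error = mean(abs(c - o) for c, o in zip(confidences, outcomes))
    return {"bias": bias, "absolute_error": absolute_error}


# Illustrative data only: four items answered by one learner.
sample = [(0.9, True), (0.8, False), (0.6, True), (0.7, False)]
summary = calibration_summary(sample)
```

Computing the same summary separately for sessions with and without AI assistance would give one simple way to compare calibration across the two conditions.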
5.3. Recommendations for Practice and Policy
5.4. Limitations of the Recommendations
6. Conclusions
References
- Azevedo, R.; Gašević, D. Analyzing multimodal multichannel data about self-regulated learning with advanced learning technologies: Issues and challenges. Comput. Hum. Behav. 2019, 96, 207–210.
- Bastani, H.; Bastani, O.; Sungu, A.; Ge, H.; Kabakcı, Ö.; Mariman, R. Generative AI without guardrails can harm learning: Evidence from high school mathematics. Proc. Natl. Acad. Sci. 2025, 122(26), e2422633122.
- Bjork, E.L.; Bjork, R.A. Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. In Psychology and the real world: Essays illustrating fundamental contributions to society; Gernsbacher, M.A., Pew, R.W., Hough, L.M., Pomerantz, J.R., Eds.; Worth Publishers, 2011; pp. 56–64.
- Butler, D.L.; Winne, P.H. Feedback and self-regulated learning: A theoretical synthesis. Rev. Educ. Res. 1995, 65(3), 245–281.
- Chi, M.T.H. Self-explaining expository texts: The dual processes of generating inferences and repairing mental models. In Advances in instructional psychology; Glaser, R., Ed.; Lawrence Erlbaum, 2000; Vol. 5, pp. 161–238.
- Dahmani, L.; Bohbot, V.D. Habitual use of GPS negatively impacts spatial memory during self-guided navigation. Sci. Rep. 2017, 7, 41128.
- Dunlosky, J.; Rawson, K.A. Overconfidence produces underachievement: Inaccurate self evaluations undermine students’ learning and retention. Learn. Instr. 2012, 22(4), 271–280.
- Flavell, J.H. Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. Am. Psychol. 1979, 34(10), 906–911.
- Gentner, D.; Markman, A.B. Structure mapping in analogy and similarity. Am. Psychol. 1997, 52(1), 45–56.
- Hacker, D.J.; Bol, L.; Bahbahani, K. Explaining calibration in classroom contexts: The effects of incentives, reflection, and explanatory style. Metacognition Learn. 2008, 3(2), 101–121.
- Hattie, J. Visible learning: A synthesis of over 800 meta-analyses relating to achievement; Routledge, 2009.
- Hong, Q.N.; Pluye, P.; Fàbregues, S.; Bartlett, G.; Boardman, F.; Cargo, M.; Dagenais, P.; Gagnon, M.P.; Griffiths, F.; Nicolau, B.; O’Cathain, A.; Rousseau, M.C.; Vedel, I. Mixed Methods Appraisal Tool (MMAT) version 2018. Educ. Inf. 2018, 34(4), 285–291.
- Ishikawa, T.; Fujiwara, H.; Imai, O.; Okabe, A. Wayfinding with a GPS-based mobile navigation system: A comparison with maps and direct experience. J. Environ. Psychol. 2008, 28(1), 74–82.
- Lakoff, G.; Johnson, M. Metaphors we live by; University of Chicago Press, 1980.
- Lincoln, Y.S.; Guba, E.G. Naturalistic inquiry; Sage, 1985.
- Maguire, E.A.; Woollett, K.; Spiers, H.J. London taxi drivers and bus drivers: A structural MRI and neuropsychological analysis. Hippocampus 2006, 16(12), 1091–1101.
- Moher, D.; Liberati, A.; Tetzlaff, J.; Altman, D.G.; The PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med. 2009, 6(7), e1000097.
- Prather, J.; Reeves, B.N.; Denny, P.; Becker, B.A.; Leinonen, J.; Luxton-Reilly, A.; Powell, G.; Finnie-Ansley, J.; Savelka, J. The robots are coming: On the potential of ChatGPT for computing education. In Proceedings of the 2024 Innovation and Technology in Computer Science Education V. 1 (ITiCSE 2024); Association for Computing Machinery, 2024.
- Renkl, A.; Atkinson, R.K. Structuring the transition from example study to problem solving in cognitive skill acquisition: A cognitive load perspective. Educ. Psychol. 2003, 38(1), 15–22.
- Risko, E.F.; Gilbert, S.J. Cognitive offloading. Trends Cogn. Sci. 2016, 20(9), 676–688.
- Roelle, J.; Schmidt, E.M.; Buchau, A.; Berthold, K. Effects of informing learners about the dangers of cognitive offloading on their offloading behavior and achievement. J. Educ. Psychol. 2017, 109(7), 971–987.
- Rozenblit, L.; Keil, F. The misunderstood limits of folk science: An illusion of explanatory depth. Cogn. Sci. 2002, 26(5), 521–562.
- Schraw, G.; Dennison, R.S. Assessing metacognitive awareness. Contemp. Educ. Psychol. 1994, 19(4), 460–475.
- Schraw, G.; Moshman, D. Metacognitive theories. Educ. Psychol. Rev. 1995, 7(4), 351–371.
- Snyder, H. Literature review as a research methodology: An overview and guidelines. J. Bus. Res. 2019, 104, 333–339.
- Thomas, J.; Harden, A. Methods for the thematic synthesis of qualitative research in systematic reviews. BMC Med. Res. Methodol. 2008, 8, 45.
- Torraco, R.J. Writing integrative literature reviews: Guidelines and examples. Hum. Resour. Dev. Rev. 2005, 4(3), 356–367.
- Tyndall, J. AACODS checklist for appraising grey literature; Flinders University, 2010. Available online: https://dspace.flinders.edu.au/xmlui/handle/2328/3326.
- VanLehn, K. The relative effectiveness of human tutoring, intelligent tutoring systems, and other tutoring systems. Educ. Psychol. 2011, 46(4), 197–221.
- Veenman, M.V.J.; Van Hout-Wolters, B.H.A.M.; Afflerbach, P. Metacognition and learning: Conceptual and methodological considerations. Metacognition Learn. 2006, 1(1), 3–14.
- Whittemore, R.; Knafl, K. The integrative review: Updated methodology. J. Adv. Nurs. 2005, 52(5), 546–553.
- Wise, A.F.; Hsiao, Y.T. Self-regulation in online discussions: Aligning data streams to model and support normative behaviors. In Proceedings of the 9th International Conference on Learning Analytics & Knowledge; Association for Computing Machinery, 2019; pp. 220–229.
- Zheng, L.; Li, X.; Chen, F. Effects of metacognitive prompts on learning outcomes in e-learning environments: A meta-analysis. J. Comput. Assist. Learn. 2019, 35(2), 179–191.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).