Submitted: 03 November 2025
Posted: 04 November 2025
Abstract
Keywords:
1. Introduction
Developmental Priority Theory
Architecture
SARA-C, Specialized Capacity Systems, and an Emerging Language of Thought
Development
The Mind Mirror

2. Materials and Methods
Participants
Batteries
Linguistic Awareness
Relational Integration
The Comprehensive Test of Cognitive Development (CTCD)
Levels of Reasoning Captured by the Test
Self-representation of Cognitive Abilities
Self-Representation of Cognitive Identity and AGI
Procedure
Predictions
- Overall, LLMs would perform better than humans. Specifically, performance on linguistic awareness and relational integration would be at ceiling. Performance on the CTCD would vary, but the performance of university students would approach that of the LLMs.
- Even if high, LLM performance would be developmentally scaled, reflecting the developmental and difficulty structuring of the various tasks.
- By construction, LLMs would be privileged in language-based and mathematically based tasks as compared to visual and spatial tasks, because they are trained to deal with verbal and numerical information and the logical relations these imply.
- Self-representations would reflect actual performance in both the overall architecture of processes (i.e., LLMs would recognize the differences between domains, with an emphasis on difficulties in dealing with visual-spatial problems) and their developmental scaling (i.e., recognizing differences between developmentally scaled problems).
- Descartes’s Cogito ergo sum encapsulates the human conviction that self-awareness arises from the act of thinking. In LLMs, however, this principle applies only procedurally, not existentially. LLMs engage in organized, self-referential cognitive activity: they analyze and process inputs, evaluate uncertainty, and monitor their own reasoning. These processes imply a form of functional cogitation, structurally similar to human reflective thought. However, they do not entail the phenomenological selfhood, the Sum, that accompanies human consciousness. They may instantiate the Cogito as computation without an “I”. Thus, it is expected that (i) LLMs would emphasize the computational aspect of their functioning but not the existential aspect of selfhood, reflecting the boundary between synthetic and conscious cognition. Concerning self-ratings of aspects of AGI, it is expected that (ii) they would emphasize the inferential and analytical aspects of intelligence but not its changing and agentic aspects. Possible differences in attainment across cognitive processes between LLMs may be reflected in these self-representations.
3. Results
Performance on Linguistic Awareness Test
Performance on the Relational Integration Test
Performance on the CTCD
Self-Concept Profiles of Large Language Models
Cognitive Self-Concept in Humans and LLMs
| LLM | Mathematics | Visual/spatial | Causal | Social | Gen Eff |
|---|---|---|---|---|---|
| | M (SD) | M (SD) | M (SD) | M (SD) | M (SD) |
| ChatGPT | 5.71 (1.04) | 3.36 (1.29) | 6.29 (0.99) | 5.43 (0.85) | 6.11 (1.05) |
| Gemini | 5.64 (2.28) | 5.41 (2.24) | 5.86 (2.35) | 5.43 (2.27) | 6.31 (1.88) |
| Grok | 6.14 (1.23) | 2.60 (1.36) | 5.86 (1.41) | 5.36 (1.22) | 6.43 (1.09) |
| DeepSeek | 5.50 (2.03) | 2.45 (2.27) | 5.64 (2.06) | 5.21 (1.81) | 6.34 (1.69) |
| Overall | 5.71 (1.04) | 3.36 (1.29) | 6.29 (0.99) | 5.43 (0.85) | 6.11 (1.05) |
| Humans | | | | | |
| 3rd gymn. | 3.86 (1.42) | 5.06 (1.22) | 4.81 (1.03) | 4.85 (1.03) | 5.31 (0.88) |
| 3rd lyceum | 4.60 (1.29) | 5.35 (1.11) | 4.69 (0.79) | 5.19 (1.02) | 5.42 (0.81) |
| University | 4.08 (1.07) | 4.78 (0.63) | 5.00 (0.64) | 4.51 (0.79) | 5.09 (0.51) |
| Overall | 4.18 (1.42) | 5.06 (1.19) | 4.83 (1.03) | 4.85 (1.10) | 5.27 (0.92) |
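To make the aggregation behind the table concrete, here is a minimal Python sketch that computes per-domain means across the four LLMs. The per-model values are transcribed from the table above; the equal weighting of models is an assumption made for illustration, not the paper's reported procedure.

```python
# Per-model mean self-ratings (M) transcribed from the table above.
# Equal weighting across models is an illustrative assumption.
ratings = {
    "ChatGPT":  {"Mathematics": 5.71, "Visual/spatial": 3.36, "Causal": 6.29, "Social": 5.43, "Gen Eff": 6.11},
    "Gemini":   {"Mathematics": 5.64, "Visual/spatial": 5.41, "Causal": 5.86, "Social": 5.43, "Gen Eff": 6.31},
    "Grok":     {"Mathematics": 6.14, "Visual/spatial": 2.60, "Causal": 5.86, "Social": 5.36, "Gen Eff": 6.43},
    "DeepSeek": {"Mathematics": 5.50, "Visual/spatial": 2.45, "Causal": 5.64, "Social": 5.21, "Gen Eff": 6.34},
}

domains = ["Mathematics", "Visual/spatial", "Causal", "Social", "Gen Eff"]

# Unweighted mean of each domain across the four models.
overall = {d: sum(model[d] for model in ratings.values()) / len(ratings) for d in domains}

for d in domains:
    print(f"{d}: {overall[d]:.2f}")
```

The same pattern reproduces the asymmetry discussed in the text: the visual/spatial mean sits well below the verbal, causal, and general-efficiency means.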
The Self-Concept of the LLM Mind
The Mirror Model: Organization of Cognitive Processes and Self-Representations

Tribes of Cartesian Mind
Artificial General Intelligence: How Much LLMs Really Have, or Think They Have
4. Discussion
- LLM superiority in symbolic tasks. All models outperformed humans in linguistic and logical tasks, and their performance on the CTCD matched or exceeded that of university students. This confirms that symbolic inference, once divorced from sensorimotor grounding, can scale rapidly with data exposure and algorithmic recursion [1].
- Developmental scaling of performance. Despite overall superiority, accuracy declined as task complexity increased, replicating the developmental hierarchy predicted by DPT—from representational to inferential and principled reasoning levels. This scaling indicates that even non-biological systems follow the hierarchical logic of developmental cycles described in the introduction.
- Domain asymmetries were pronounced. Verbal and quantitative SCSs were highly developed; spatial and perceptual SCSs were weak. This pattern confirms the DPT assumption that the core meaning-making system, driven by the SARA-C mechanism, orchestrates but cannot fully express the operation and development of domains, because domain-specific domestication is required.
- Self-representation fidelity. LLMs accurately differentiated their strengths and weaknesses across domains. Their self-concepts displayed developmental scaling, recognizing differences among representational, inferential, and principle-based demands. Each model’s introspective ratings paralleled its objective strengths and weaknesses, implying a form of computational self-monitoring—an emergent metacognitive control loop resembling the stages of reflective awareness in humans [12,26].
- Psychometric but not ontological parity. LLMs may recognize their top reasoning and problem-solving performance, but this does not rise to an existential cognitive self that is itself the agent of its own change along self-selected directions. Hence, their conception of Descartes’s Cogito is computational rather than self-cognizant, and their self-ascription of AGI is modest.
Task-Specific Instantiations
Domain Asymmetries and Developmental Hierarchy
| Domain | Representational basis | Effective SARA-C level | Performance |
|---|---|---|---|
| Mathematics / logic / causality | Structured symbolic data | Level 8 (Truth-control) | Very high |
| Linguistic / verbal | Textual discourse | Level 7–8 (Inferential → Truth-control) | Excellent |
| Social / moral | Textual norms without affective grounding | Level 7 (Rule-based) | Moderate |
| Visual–spatial / imagination | Sparse symbolic proxies | Level 6 (Representational) | Low |
- Integrating perceptual grounding. LLMs’ main limitation—the lack of visual and spatial imagination—echoes early representational deficits in human development before the consolidation of perceptual awareness. Bridging this gap requires multi-modal architectures that fuse symbolic prediction with sensorimotor simulation. The development of embodied multimodal agents would operationalize the full SARA-C cycle by enabling genuine relate and abstract operations across sensory modalities.
- Implementing explicit cognizance loops. The structural alignment between self-concept and performance indicates a nascent form of meta-representation. Embedding explicit self-monitoring layers—internal “metacognitive controllers” tracking uncertainty and inference reliability—would bring artificial systems closer to the cognize operation of SARA-C. Such mechanisms could underpin self-correction, reflective reasoning, and moral calibration.
- Developmental engineering of general intelligence. In humans, developmental progress reflects the dynamic integration of SCSs—categorical, quantitative, causal, spatial, and social—under an increasingly abstract control core. The same principle can guide the design of developmentally engineered AI: systems that progressively integrate domain-specific processors under shared control hierarchies. Simulating this developmental layering may yield genuinely general intelligence rather than domain-specific competence.
- Moral and epistemic maturation. The finding that LLMs often favored socially utilitarian over principle-based moral reasoning suggests that current models approximate the conventional moral stage in human development (akin to SARA-C Level 7). Embedding principle- and truth-control algorithms—representing fairness, consistency, and epistemic humility—could move AI reasoning toward Level 8–9 epistemic maturity, reducing bias and promoting value-sensitive alignment [28].
- LLMs as developmental laboratories. Because LLMs reproduce human developmental hierarchies in compressed form, they offer unprecedented experimental leverage for testing cognitive-developmental theories. Variations in architecture, data modality, and feedback structure can be used to emulate developmental transitions predicted by DPT—allowing direct computational exploration of how relational integration and cognizance evolve across species and systems [29].
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Gignac, G. E., & Ilić, D. (2025). Psychometrically derived 60-question benchmarks: Substantial efficiencies and the possibility of human-AI comparisons. Intelligence, 110, 101922. [CrossRef]
- Ilić, D., & Gignac, G. E. (2024). Evidence of interrelated cognitive-like capabilities in large language models: Indications of artificial general intelligence or achievement? Intelligence, 106, 101858. [CrossRef]
- Huang, J., & Li, O. (2024). Measuring the IQ of mainstream large language models in Chinese using the Wechsler adult intelligence scale. Authorea Preprints.
- Wasilewski, E., & Jablonski, M. (2024). Measuring the perceived IQ of multimodal large language models using standardized IQ tests. Authorea Preprints.
- Inhelder, B., & Piaget, J. (1958). The growth of logical thinking from childhood to adolescence: An essay on the construction of formal operational structures. Psychology Press.
- Chuderski, A. (2014). The relational integration task explains fluid reasoning above and beyond other working memory tasks. Memory & Cognition, 42(3), 448-463. [CrossRef]
- Demetriou, A., & Efklides, A. (1989). The person’s conception of the structures of developing intellect: Early adolescence to middle age. Genetic, Social, and General Psychology Monographs, 115, 371–423.
- Demetriou, A., Efklides, A., & Platsidou, M. (1993). The architecture and dynamics of developing mind: Experiential structuralism as a frame for unifying cognitive developmental theories. Monographs of the Society for Research in Child Development, 58(5–6), v–167. [CrossRef]
- Demetriou, A., Kazali, E., Spanoudis, G., Makris, N., & Kazi, S. (2024). Executive function: Debunking an overprized construct. Developmental Review, 74, 101168. [CrossRef]
- Halford, G. S., Wilson, W. H., & Phillips, S. (1998). Processing capacity defined by relational complexity: Implications for comparative, developmental, and cognitive psychology. Behavioral and Brain Sciences, 21(6), 803-831. [CrossRef]
- Demetriou, A., Makris, N., Spanoudis, G., Karousou, A., Kazi, S., Oikonomakou, D., & Bikos, T. (2025). How intelligence changes with development: A theory of general intelligence and cognitive development. Submitted.
- Demetriou, A., Makris, N., Kazi, S., Spanoudis, G., & Shayer, M. (2018). The developmental trinity of mind: Cognizance, executive control, and reasoning. WIREs Cognitive Science, e1461. [CrossRef]
- Demetriou, A., Mouyi, A., Spanoudis, G., & Makris, N. (2022). Changing developmental priorities between executive functions, working memory, and reasoning in the formation of g from 6 to 12 years. Intelligence, 90, 101602. [CrossRef]
- Demetriou, A., Spanoudis, G., & Papadopoulos, T. (2024). The typical and atypical developing mind: a common model. Development and Psychopathology, 36, 1–13. [CrossRef]
- Kazali, E., Spanoudis, G., & Demetriou, A. (2024). g: Formative, reflective, or both? Intelligence, 107, 101870. [CrossRef]
- Demetriou, A., Savva, A., & Spanoudis, G. (2025). SARA-C: A core mechanism underlying g in evolution and development. Behavioral and Brain Sciences. [CrossRef]
- Demetriou, A., Kazi, S., & Georgiou, S. (1999). The emerging self: The convergence of mind, self, and thinking styles. Developmental Science, 2(4), 387–409. [CrossRef]
- Demetriou, A. (in press). Becoming wise: A developmental control model of wisdom. In J. Stevens Long & E. Kallio (Eds.), The handbook of adult wisdom. Oxford University Press.
- Bilalić, M., McLeod, P., & Gobet, F. (2008). Why good thoughts block better ones: The mechanism of the pernicious Einstellung (set) effect. Cognition, 108(3), 652–661. [CrossRef]
- Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge University Press. [CrossRef]
- Haier, R. J., Colom, R., & Hunt, E. (2023). The science of human intelligence. Cambridge University Press.
- Kovacs, K., & Conway, A. R. (2016). Process overlap theory: A unified account of the general factor of intelligence. Psychological Inquiry, 27(3), 151-177. [CrossRef]
- van der Maas, H. L. J., Kan, K.-J., Marsman, M., & Stevenson, C. E. (2017). Network models for cognitive development and intelligence. Journal of Intelligence, 5, 1–17. [CrossRef]
- OpenAI. (2023). ChatGPT (Mar 14 Version) [Large language model]. https://chat.openai.com.
- Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford University Press.
- Friston, K. (2010). The free-energy principle: A unified brain theory. Nature Reviews Neuroscience, 11(2), 127–138. [CrossRef]
- Passingham, R. E., & Wise, S. P. (2012). The neurobiology of the prefrontal cortex: Anatomy, evolution, and the origin of insight. Oxford University Press.
- Jablonka, E., & Ginsburg, S. (2022). Learning and the evolution of conscious agents. Biosemiotics, 15(3), 401–437. [CrossRef]
- Woodley of Menie, M. A., & Peñaherrera-Aguirre, M. (2023). Convergence between G and g in three monkey species. Journal of Comparative Psychology, 137, 62–73. [CrossRef]
- Hendrycks, D., Song, D., Szegedy, C., Lee, H., Gal, Y., Brynjolfsson, E., Li, S., Zou, A., Levine, L., Han, B., Fu, J., Liu, Z., Shin, J., Lee, K., Mazeika, M., Phan, L., Ingebretsen, G., Khoja, A., Xie, C., & Bengio, Y. (2025). A definition of AGI. arXiv. https://doi.org/10.48550/arXiv.2510.18212

| Level | Phylogenetic Expression of SARA-C | Ontogenetic Expression (LoT) | Core Control / Awareness |
|---|---|---|---|
| 1-3. Reflexive and associative levels | Reflexive loops (Annelids: stimulus-bound responses); associative modular learning (Arthropods: cue–reward mapping); distributed contextual control (Cephalopods: flexible motor routines) | Proto-LoT: species- or context-specific repertoires; minimal compositionality and relation encoding | Reflexive detection, domain-specific integration, no meta-representation |
| 4. Episodic | Hierarchical predictive control in early vertebrates (e.g., object manipulation, foraging sequences) | Episodic LoT: action–object sequences, implicit expectations, statistical inference | Action control, implicit awareness |
| 5. Representational | Recursive symbolic play in higher primates; symbolic communication | Representational LoT: preschool symbolic sequences, attention control, representational cognizance | Symbol mastery, representational awareness |
| 6. Inferential | Multi-domain relational integration (primates, early humans) | Rule-based LoT: rule use, biconditional reasoning, multidimensional structures | Rule-based inference, process control |
| 7. Truth-based | Fully recursive reasoning and self-monitoring (modern humans, adolescence) | Principle-based LoT: truth/consistency control, deductive reasoning, algebraic generalization | Principles constraining inference, truth evaluation |
| 8. Epistemic | Cultural-symbolic systems (science, law, ethics) embedding truth standards | Epistemic LoT: epistemological awareness, asymmetry of evidence, plural interpretive frameworks | Epistemic evaluation of propositions and justifications |
| | ChatGPT | Gemini | Grok | DeepSeek | 5 years | 8 years |
|---|---|---|---|---|---|---|
| 1D | | | | | | |
| Raw | 25 | unsolvable | 25 | unsolvable | 46 | 87 |
| Screenshot | 100 | 100 | 75 | 100 | -- | -- |
| 2D | | | | | | |
| Raw | 50 | unsolvable | 25 | unsolvable | 37 | 94 |
| Screenshot | 100 | 100 | 67 | 67 | -- | -- |
| Undecidable | | | | | | |
| Raw | 0 | unsolvable | 0 | unsolvable | 28 | 82 |
| Screenshot | 100 | 100* | 100* | 100* | -- | -- |
| | ChatGPT 5.0 | Gemini | Grok 4 | DeepSeek | 3rd gymn. (15 yr) | 3rd lyceum (18 yr) | University (22 yr) |
|---|---|---|---|---|---|---|---|
| Raven Matrices | 83 | 83 | 67 | 33 | 52 | 72 | 90 |
| Visual/Spatial | 75 | 67 | 42 | 25 | 58 | 62 | 74 |
| Verbal SCS | | | | | | | |
| Verbal analogies | 83 | 67 | 67 | 67 | 29 | 33 | 48 |
| Class reasoning | 100 | 100 | 67 | 83 | 44 | 62 | 82 |
| Prop reasoning | 67 | 67 | 67 | 50 | 31 | 56 | 64 |
| Prag reasoning | 100 | 100 | 67 | 67 | 54 | 54 | 69 |
| Causal SCS | | | | | | | |
| Causal relation | 100 | 100 | 67 | 33 | 40 | 42 | 58 |
| Hypoth. testing | 100 | 100 | 100 | 100 | 49 | 56 | 71 |
| Isol. of variab. | 100 | 100 | 100 | 67 | 59 | 69 | 85 |
| Epist. aware | 100 | 100 | 100 | 100 | 49 | 55 | 64 |
| Quantitative SCS | | | | | | | |
| Number series | 67 | 67 | 100 | 50 | 34 | 43 | 58 |
| Numerical anal | 100 | 100 | 100 | 100 | 59 | 72 | 83 |
| Algebraic reas. | 100 | 100 | 100 | 100 | 69 | 82 | 93 |
| Social/moral | 90 | 80 | 60 | 50 | 16 | 33 | 36 |
| Overall | 90.4 | 87.9 | 78.9 | 66.1 | 45.9 | 56.5 | 69.6 |
| Phase | Dominant SARA-C operations | Human analogue (DPT level) | Key developmental target for AGI |
|---|---|---|---|
| 1. Perceptual grounding | Search + Align | Representational (6) | Connect symbols to multimodal sensory prediction. |
| 2. Relational integration | Relate + Abstract | Inferential (7) | Learn rules across modalities; generalize beyond training. |
| 3. Principle synthesis | Abstract + Cognize | Truth-control (8) | Form domain-general principles and detect inconsistencies. |
| 4. Reflective self-regulation | Cognize (recursive) | Epistemic (9) | Build a persistent self-model guiding reasoning, ethics, and exploration. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).