We compared four Large Language Models (LLMs: ChatGPT, Grok, Gemini, and DeepSeek) with humans on tests of cognitive development, assessing relational integration, linguistic awareness, general and domain-specific reasoning, and cognitive self-awareness, to specify how LLMs compare with humans along cognitive developmental hierarchies. The LLMs also discussed how Descartes’s Cogito applies to them and rated themselves on aspects of Artificial General Intelligence (AGI). The study thus offers a novel interdisciplinary comparison of human and LLM capabilities that integrates developmental, cognitive, and psychometric psychology. Overall, the structure of processes in humans and LLMs was highly similar. All LLMs attained perfect linguistic and metalinguistic performance. ChatGPT and Gemini outperformed university students in mathematics and causal reasoning; Grok performed slightly lower and DeepSeek considerably lower. All LLMs underperformed in visual–spatial tasks. Self-evaluation profiles broadly mirrored performance profiles: ChatGPT and Grok rated themselves high in reasoning and low in visualization, Gemini inflated its visualization rating by reframing visualization as linguistic creativity, and DeepSeek consistently underrated itself. Each LLM restated Descartes’s Cogito differently, reflecting its own priorities, and each denied possessing a high level of AGI. Thus, LLMs display human-like “subjective” task scaling, which implies algorithmic or functional metacognition and captures their architectural gap between symbolic reasoning and visuospatial cognition, yet they were modest about claiming top human intelligence. We discuss implications for an integrated theory of natural and artificial intelligence and sketch a developmental engineering model that might help remove the limitations of each LLM.