Preprint
Article

This version is not peer-reviewed.

Knowing Before Speaking: In-Computation Metacognition Precedes Verbal Confidence in Large Language Models

Submitted: 01 April 2026

Posted: 03 April 2026


Abstract
We propose the Knowledge Landscape hypothesis: a large language model's forward pass encodes whether it knows the answer before any output token is produced. Well-learned knowledge traverses deep convergence valleys in the activation landscape, whereas unlearned queries traverse flat plains where signals disperse. These geometric properties manifest as two probe-free, single-pass signals, token-level entropy and layer-wise hidden-state variance, that precede and causally influence output uncertainty. Across two architecturally distinct models (Qwen2.5-7B and Mistral-7B) on TriviaQA, token entropy strongly discriminates known from unknown questions with large effect sizes, replicated at 300 samples per condition with a 95% bootstrap confidence interval lying entirely above 0.64. Hidden-state variance further localises a metacognitive locus in both architectures, consistently at 61–69% of total network depth, suggesting a universal structural property of transformer LLMs. Activation patching confirms causality: injecting a known-question hidden state into an unknown-question forward pass monotonically reduces output entropy. A lightweight abstention system built on these signals achieves a ROC-AUC of 0.804 and a 5.6-percentage-point accuracy gain over the unaided baseline, without any fine-tuning or additional training data.
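The two signals the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: it assumes access to the model's next-token logits and per-layer final-position activations (e.g. what a Hugging Face model returns with `output_hidden_states=True`), and demonstrates the computations on synthetic data. The `abstain` threshold value is a hypothetical placeholder, not a number from the paper.

```python
import math

def token_entropy(logits):
    """Shannon entropy (nats) of the next-token distribution given raw logits."""
    m = max(logits)                               # stabilise the softmax
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0)

def layerwise_variance(hidden_states):
    """Variance of the final-position hidden state at each layer.

    `hidden_states` is a list with one activation vector per layer; the
    layer whose variance best separates known from unknown queries would
    sit, per the abstract, at roughly 61-69% of network depth.
    """
    variances = []
    for vec in hidden_states:
        mean = sum(vec) / len(vec)
        variances.append(sum((x - mean) ** 2 for x in vec) / len(vec))
    return variances

def abstain(logits, threshold=2.0):
    """Abstention rule: refuse to answer when entropy exceeds a threshold.

    The threshold here is an arbitrary illustrative value; in practice it
    would be tuned on held-out data (the paper reports ROC-AUC 0.804).
    """
    return token_entropy(logits) > threshold

# Synthetic demo: peaked logits ("model knows") vs. near-uniform logits.
peaked = [10.0] + [0.0] * 99     # mass concentrated on one token
flat = [0.0] * 100               # uniform distribution, entropy = ln(100)
print(token_entropy(peaked))     # low entropy -> answer
print(token_entropy(flat))       # ~4.6 nats -> abstain
```

The key property exploited by the abstention system is that both quantities come from a single ordinary forward pass: no probe network, no sampling, and no fine-tuning are required.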
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.