Preprint
Article

This version is not peer-reviewed.

Limits of Self-Correction in LLMs: An Information-Theoretic Analysis of Correlated Errors

Submitted:

10 February 2026

Posted:

11 February 2026


Abstract
Recent empirical work shows that large language models struggle to self-correct reasoning without external feedback. We propose a possible explanation: correlated errors between generator and evaluator. When both components share failure modes, self-evaluation may provide weak evidence of correctness, and repeated self-critique may amplify confidence without adding information. We formalize this with two information-theoretic bounds. We then describe a practical architecture pairing high-entropy proposal generation with low-entropy external selection. This suggests an alternative to extended chain-of-thought in a single context: separate generation from evaluation using fresh context, restoring the external feedback loop that human reasoning relies on. Importantly, the architecture can be implemented with the same model, reducing error correlation at no additional computational cost. It does not replace human judgment; it provides a filter that surfaces candidates that survive external scrutiny for human review.
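To make the proposed separation concrete, the sketch below illustrates one way to pair high-entropy proposal generation with low-entropy selection in a fresh context, using a single model. It is a minimal sketch under stated assumptions, not the authors' implementation: the `complete` callable is a hypothetical stand-in for any chat/completion API, and the prompts, candidate count, temperatures, and 0-10 scoring format are illustrative choices.

```python
# Minimal sketch (assumptions noted above): generate-then-select with the same
# model, where the evaluator runs in a fresh context that never saw the
# generator's reasoning, reducing error correlation between the two roles.

from typing import Callable, List, Tuple


def propose_and_select(
    task: str,
    complete: Callable[[str, float], str],  # hypothetical: (prompt, temperature) -> text
    n_candidates: int = 8,
    gen_temperature: float = 1.0,   # high-entropy proposal generation
    eval_temperature: float = 0.0,  # low-entropy external selection
) -> Tuple[str, float]:
    # 1) High-entropy generation: sample diverse candidate solutions.
    candidates: List[str] = [
        complete(f"Solve the following task:\n{task}", gen_temperature)
        for _ in range(n_candidates)
    ]

    # 2) Low-entropy selection: score each candidate in a fresh context.
    def score(candidate: str) -> float:
        verdict = complete(
            "You are grading a proposed solution. Reply with a single number "
            f"from 0 to 10 for correctness.\nTask:\n{task}\n"
            f"Proposed solution:\n{candidate}",
            eval_temperature,
        )
        try:
            return float(verdict.strip().split()[0])
        except (ValueError, IndexError):
            return 0.0  # an unparseable verdict counts as no evidence of correctness

    scored = [(candidate, score(candidate)) for candidate in candidates]

    # 3) Surface the best-scoring candidate for human review (the filter role
    #    described in the abstract), not as a final answer.
    return max(scored, key=lambda pair: pair[1])
```

Because the evaluator sees only the task and the finished candidate, it cannot inherit confidence from the generator's chain of thought; the only shared failure modes are those intrinsic to the model itself, which is the correlation the paper's bounds are meant to characterize.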
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.