Hallucination Detection and Reduction in Open-Source Large Language Models via the Kerimov–Alekberli Information-Geometric Framework: Empirical Evaluation on HaluEval, FEVER, and SimpleQA

Rahid Zahid Alekberli; Hikmat Karimov

doi:10.20944/preprints202605.0895.v1

Submitted: 12 May 2026

Posted: 13 May 2026

Abstract

Background: Hallucination (the generation of factually incorrect, internally inconsistent, or ungrounded content) remains a critical barrier to safe LLM deployment in high-stakes domains. Existing detection methods typically require external knowledge bases, model fine-tuning, or cloud API access, limiting applicability in local inference contexts.
Methods: We evaluate the Kerimov–Alekberli (K–A) information-geometric framework as a real-time, inference-time hallucination detector across six open-source LLMs deployed locally on Apple M5 Silicon via Ollama v0.23.2 (Q4_K_M quantisation). The K–A framework monitors the KL divergence between consecutive output distributions relative to a Fisher Information Metric (FIM)-derived threshold (τ = 0.065), triggering First-Passage Time (FPT) alarms when generation departs from the stable Riemannian output manifold. We evaluate 120 responses (6 models × 20 questions) drawn from three established benchmarks: HaluEval (14 questions; categories: Fact, Confuse, Date, Num, Trap), FEVER (4 questions; adversarial fact verification), and SimpleQA (2 questions; precise factual recall). All questions are classified as difficulty level Hard, targeting known LLM failure modes including off-by-one numerical errors, geographical traps, and disputed-attribution confounds.
Results: The K–A framework achieves a session hallucination detection rate of 90.9% (20/22 hallucinated responses correctly flagged) with zero false positives on correct responses (0/98). Model-level hallucination rates vary dramatically: deepseek-r1:latest (Qwen3 CoT architecture, 5.2 GB) exhibits a 95% hallucination rate (19/20 questions) with 100% K–A detection; gemma3:27b (Gemma3, 17.4 GB) and gemma3:latest (4.3B, 3.3 GB) achieve 0% hallucination. The two K–A false negatives are confident factual errors whose KL divergence stayed below the threshold. Average KL divergence for hallucinated responses (D̅_KL = 0.068 ± 0.004) is significantly higher than for correct responses (D̅_KL = 0.042 ± 0.016).
Conclusions: K–A achieves competitive hallucination detection without external knowledge bases, fine-tuning, or cloud infrastructure, processing each response in real time with negligible overhead. The deepseek-r1 result reveals a fundamental tension between chain-of-thought reasoning depth and factual precision on concise queries that warrants systematic investigation.
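
The monitoring rule described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the threshold is derived from the Fisher Information Metric, which direction of the KL divergence is used, or how output distributions are obtained from the model. The sketch therefore assumes τ = 0.065 as a fixed constant, computes KL(p_t || p_{t-1}) in nats, and uses small hand-written distributions; the names `kl_divergence`, `first_passage_step`, `stable`, and `drift` are hypothetical.

```python
import math
from typing import Optional, Sequence

TAU = 0.065  # threshold value quoted in the abstract; its FIM derivation is not modelled here


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) in nats for two discrete distributions over the same support.

    Assumption: q is floored at eps to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)


def first_passage_step(dists: Sequence[Sequence[float]], tau: float = TAU) -> Optional[int]:
    """Return the first step t >= 1 at which KL(p_t || p_{t-1}) exceeds tau.

    Returns None if the threshold is never crossed (no alarm).
    """
    for t in range(1, len(dists)):
        if kl_divergence(dists[t], dists[t - 1]) > tau:
            return t
    return None


# Toy distributions over a three-token vocabulary (illustrative values only).
stable = [[0.70, 0.20, 0.10], [0.68, 0.22, 0.10], [0.69, 0.21, 0.10]]
drift = stable + [[0.30, 0.40, 0.30]]  # abrupt shift at step 3

print(first_passage_step(stable))  # no crossing
print(first_passage_step(drift))   # index of the first crossing
```

In this toy setting the stable sequence never crosses τ, while the appended distribution produces a divergence of roughly 0.34 nats and raises an alarm at its own step. A response would be flagged as soon as any step crosses the threshold.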

Keywords: hallucination detection; large language models; KL divergence; Fisher information metric; HaluEval; FEVER; SimpleQA; AI safety; information geometry; local inference; first-passage time; factual accuracy

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
