Submitted: 29 June 2026
Posted: 30 June 2026
Abstract
Keywords:
1. Introduction
- To the best of our knowledge, we take the first step toward a training-free hallucination detection method specifically designed for D-LLMs, which mitigates the data dependency and generalizability limitations of existing training-based detectors.
- We propose TRE, a training-free, single-run metric for hallucination detection in D-LLMs, with a design grounded in empirical analysis. TRE is simple, general, and accessible: it requires neither data-driven training nor multi-run sampling.
- Extensive experiments on multiple datasets and backbone D-LLMs demonstrate that TRE achieves strong effectiveness, computational efficiency, robustness, and generalizability across diverse scenarios, highlighting its potential for practical deployment.
2. Related Work
3. Preliminaries
4. TRE for D-LLM Hallucination Detection
4.1. Revelation State-Based Evidence Construction
4.2. Temporally Weighted Evidence Aggregation
TRE Metric Design.
4.3. Understanding TRE from a Statistical Physics Perspective
5. Experiments
5.1. Experimental Setup
5.2. Results and Analysis
6. Conclusions
Appendix A. Details of Motivating Experiments
Appendix A.1. Evidence Construction Experiments
Token-Family Evidence.
Cohen’s d.
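For reference, Cohen's d is conventionally computed as the difference between two group means divided by their pooled standard deviation. The sketch below uses that standard pooled-variance form; the function name `cohens_d` and the idea of comparing an evidence statistic between hallucinated and non-hallucinated responses are assumptions made for illustration, not details confirmed by this section.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standard pooled-variance Cohen's d between two samples.

    group_a, group_b: sequences of scalar scores (hypothetical example:
    an evidence statistic for hallucinated vs. non-hallucinated responses).
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
```

A larger absolute value indicates that the two groups are better separated by the statistic being compared.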
Analysis.

Appendix A.2. Revealing-Token Entropy Trajectory
Analysis.

Appendix B. Detailed Analogy of CDS and D-LLM
Appendix B.1. Discrete Diffusion, Reveal Operators, and Effective Energy
Appendix B.2. Selection-Induced Residual Concentration and Boundary Entropy Flux
Reveal-Boundary Flux Identity.
Why This Flux is Diagnostic.
Appendix B.3. Pointwise Entropy Amplification Under Discrete Reveal
Appendix B.4. Committed Tokens as Attention-Induced Boundary Conditions
Appendix B.5. Boundary Susceptibility and Unstable Scaffolds
Proposition 1.
Proof.
Appendix B.6. From Boundary Entropy Flux to TRE
Appendix B.7. Summary of the Formal Picture
Appendix C. Detailed Experimental Setup
Appendix C.1. Datasets and Benchmarks
- TriviaQA [46] is an open-domain question-answering dataset centered on factual knowledge. Since many questions require identifying specific entities, dates, locations, or events, this benchmark is useful for evaluating whether a detector can recognize failures in knowledge-intensive generations.
- HotpotQA [45] contains questions that often require combining multiple pieces of evidence before producing the final answer. Compared with single-hop factual recall, this setting introduces additional reasoning complexity, making it suitable for testing whether hallucination detection methods can capture errors that arise during compositional or multi-step reasoning.
- CommonsenseQA [55] focuses on commonsense reasoning. The questions typically require models to rely on implicit everyday knowledge rather than simply matching surface-level facts. This benchmark therefore complements the other two datasets by testing hallucination detection in scenarios where correctness depends on plausible world knowledge and commonsense associations.
Appendix C.2. Model Backbones
Appendix C.3. Baseline Configurations
Training-based Detectors.
- CCS [30] learns a contrastive direction in the representation space by encouraging consistency between paired views of truthful and untruthful behavior. It provides a representative latent-probing baseline for detecting whether internal activations encode factual correctness.
- TSV [52] constructs a truthfulness-oriented separator vector from latent representations. The learned direction is then used to assign truthfulness scores to model outputs according to their positions in the hidden-state space.
- TraceDet [11] is a trajectory-based detector designed for diffusion language models. Instead of relying only on the final answer, it analyzes information from the denoising process and aggregates trajectory-level signals for hallucination detection. Since TraceDet is a strong recent detector for D-LLMs, we treat it as the primary training-based baseline.
Training-Free Detectors.
- Perplexity [53] uses the likelihood assigned by the model to its generated sequence as a confidence signal. Responses with lower likelihood are generally treated as less reliable.
- LN-Entropy [54] measures predictive uncertainty through length-normalized entropy. By averaging uncertainty over the generated sequence, it reduces the bias introduced by different response lengths.
- Semantic Entropy [21] estimates uncertainty from the semantic diversity of multiple sampled responses. Generations that express inconsistent meanings are assigned higher uncertainty, even if their surface forms differ only partially.
- Lexical Similarity [22] evaluates the agreement among multiple sampled outputs using surface-level textual overlap. Lower similarity across samples suggests weaker generation stability and potentially higher hallucination risk.
- EigenScore [23] derives a confidence score from the spectral properties of hidden representations. It captures variation in the internal representation space and uses this structure as an indicator of response reliability.
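The simpler training-free scores described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the configuration used in the experiments: the function names are hypothetical, per-token log-probabilities and predictive distributions are assumed to be available from the model, LN-Entropy is approximated as mean per-position Shannon entropy, and Lexical Similarity is approximated with token-level Jaccard overlap rather than whichever overlap measure the original baseline uses.

```python
import math
from itertools import combinations


def perplexity(token_logprobs):
    """exp of the negative mean log-probability of the generated tokens."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def ln_entropy(token_distributions):
    """Mean Shannon entropy (in nats) over generated positions.

    token_distributions: one probability vector per generated position.
    Averaging over positions is the length normalization (assumed form).
    """
    if not token_distributions:
        raise ValueError("need at least one distribution")
    total = 0.0
    for dist in token_distributions:
        total += -sum(p * math.log(p) for p in dist if p > 0.0)
    return total / len(token_distributions)


def lexical_similarity(samples):
    """Mean pairwise Jaccard overlap of whitespace tokens across samples.

    Jaccard overlap is a stand-in chosen for this sketch.
    """
    pairs = list(combinations(samples, 2))
    if not pairs:
        raise ValueError("need at least two sampled responses")

    def jaccard(a, b):
        set_a, set_b = set(a.lower().split()), set(b.lower().split())
        union = set_a | set_b
        return len(set_a & set_b) / len(union) if union else 1.0

    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

Higher perplexity or entropy, or lower similarity across samples, would be read as a sign of lower reliability. Note that only the first two scores work from a single generation; the similarity score needs several sampled responses.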
Appendix C.4. Implementation Details
Dataset Construction.
Response Generation.
Decoding Configuration.
Automatic Annotation.
Feature Extraction and Scoring.
Computational Environment.
Appendix D. Additional Experimental Results
Appendix D.1. Additional Ablation Study
| Variant | TriviaQA | HotpotQA | CSQA |
|---|---|---|---|
| All tokens | 74.6 | 73.8 | 63.9 |
| Unrevealed | 76.4 | 75.7 | 60.2 |
| Revealed | 62.1 | 62.9 | 63.4 |
| Average | 83.2 | 79.7 | 77.4 |
| Exponential | 83.9 | 80.4 | 77.7 |
| Hard last- | 83.5 | 80.2 | 77.7 |
| TRE (ours) | 83.9 | 80.6 | 77.7 |
Appendix D.2. Additional Case Studies

Appendix E. Limitations and Broader Impacts
References
1. Savinov, N.; Chung, J.; Binkowski, M.; Elsen, E.; Oord, A.v.d. Step-unrolled denoising autoencoders for text generation. In Proceedings of the International Conference on Learning Representations, 2022.
2. Li, X.; Thickstun, J.; Gulrajani, I.; Liang, P.S.; Hashimoto, T.B. Diffusion-LM improves controllable text generation. Advances in Neural Information Processing Systems 2022, 35, 4328–4343.
3. Lou, A.; Meng, C.; Ermon, S. Discrete diffusion modeling by estimating the ratios of the data distribution. In Proceedings of the International Conference on Machine Learning, 2024.
4. Israel, D.; Broeck, G.V.d.; Grover, A. Accelerating diffusion LLMs via adaptive parallel decoding. arXiv preprint arXiv:2506.00413 2025.
5. Arriola, M.; Gokaslan, A.; Chiu, J.T.; Yang, Z.; Qi, Z.; Han, J.; Sahoo, S.S.; Kuleshov, V. Block diffusion: Interpolating between autoregressive and diffusion language models. In Proceedings of the International Conference on Learning Representations, 2025.
6. Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Ishii, E.; Bang, Y.J.; Madotto, A.; Fung, P. Survey of hallucination in natural language generation. ACM Computing Surveys 2023, 55, 1–38.
7. Huang, L.; Yu, W.; Ma, W.; Zhong, W.; Feng, Z.; Wang, H.; Chen, Q.; Peng, W.; Feng, X.; Qin, B.; et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. ACM Transactions on Information Systems 2025, 43, 1–55.
8. Pan, J.; Zheng, Y.; Tan, Y.; Liu, Y. A Survey of Generalization of Graph Anomaly Detection: From Transfer Learning to Foundation Models. In Proceedings of the 16th IEEE International Conference on Knowledge Graphs, 2025.
9. Duan, J.; Cheng, H.; Wang, S.; Zavalny, A.; Wang, C.; Xu, R.; Kailkhura, B.; Xu, K. Shifting attention to relevance: Towards the predictive uncertainty quantification of free-form large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 5050–5063.
10. Qian, Y.; Tan, Y.; Liu, Y.; Yu, W.; Pan, S. DynHD: Hallucination Detection for Diffusion Large Language Models via Denoising Dynamics Deviation Learning. arXiv preprint arXiv:2603.16459 2026.
11. Chang, S.; Yu, J.; Wang, W.; Chen, Y.; Yu, J.; Torr, P.; Gu, J. TraceDet: Hallucination Detection from the Decoding Trace of Diffusion Large Language Models. In Proceedings of the International Conference on Learning Representations, 2026.
12. Hemmat, A.; Torr, P.; Chen, Y.; Yu, J. TDGNet: Hallucination Detection in Diffusion Language Models via Temporal Dynamic Graphs. arXiv preprint arXiv:2602.08048 2026.
13. Manakul, P.; Liusie, A.; Gales, M. SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 9004–9017.
14. Farquhar, S.; Kossen, J.; Kuhn, L.; Gal, Y. Detecting hallucinations in large language models using semantic entropy. Nature 2024, 630, 625–630.
15. Sahoo, S.S.; Arriola, M.; Schiff, Y.; Gokaslan, A.; Marroquin, E.; Chiu, J.T.; Rush, A.; Kuleshov, V. Simple and effective masked diffusion language models. Advances in Neural Information Processing Systems 2024, 37, 130136–130184.
16. Nie, S.; Zhu, F.; You, Z.; Zhang, X.; Ou, J.; Hu, J.; Zhou, J.; Lin, Y.; Wen, J.R.; Li, C. Large language diffusion models. arXiv preprint arXiv:2502.09992 2025.
17. Zhang, Y.; Li, Y.; Cui, L.; Cai, D.; Liu, L.; Fu, T.; Huang, X.; Zhao, E.; Zhang, Y.; Chen, Y.; et al. Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models. Computational Linguistics 2025, 51, 1373–1418.
18. Zhang, T.; Qiu, L.; Guo, Q.; Deng, C.; Zhang, Y.; Zhang, Z.; Zhou, C.; Wang, X.; Fu, L. Enhancing uncertainty-based hallucination detection with stronger focus. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 915–932.
19. Chen, Q.; Li, S.; Liu, Y.; Pan, S.; Webb, G.I.; Zhang, S. Uncertainty-aware graph neural networks: A multihop evidence fusion approach. IEEE Transactions on Neural Networks and Learning Systems 2025.
20. Kadavath, S.; Conerly, T.; Askell, A.; Henighan, T.; Drain, D.; Perez, E.; Schiefer, N.; Hatfield-Dodds, Z.; DasSarma, N.; Tran-Johnson, E.; et al. Language models (mostly) know what they know. arXiv preprint arXiv:2207.05221 2022.
21. Kuhn, L.; Gal, Y.; Farquhar, S. Semantic uncertainty: Linguistic invariances for uncertainty estimation in natural language generation. In Proceedings of the International Conference on Learning Representations, 2023.
22. Lin, Z.; Trivedi, S.; Sun, J. Generating with confidence: Uncertainty quantification for black-box large language models. Transactions on Machine Learning Research 2024.
23. Chen, C.; Liu, K.; Chen, Z.; Gu, Y.; Wu, Y.; Tao, M.; Fu, Z.; Ye, J. INSIDE: LLMs’ internal states retain the power of hallucination detection. In Proceedings of the International Conference on Learning Representations, 2024.
24. Kossen, J.; Han, J.; Razzak, M.; Schut, L.; Malik, S.; Gal, Y. Semantic entropy probes: Robust and cheap hallucination detection in LLMs. arXiv preprint arXiv:2406.15927 2024.
25. Bi, X.; Chen, C.; Chen, C.; Lv, X.; Zhu, J.; Ma, H.; Zuo, E. PatchFusionMLP: A scalable multi-resolution MLP framework for time series prediction. Pattern Recognition 2026, p. 113263.
26. Azaria, A.; Mitchell, T. The internal state of an LLM knows when it’s lying. In Findings of the Association for Computational Linguistics: EMNLP 2023, 2023, pp. 967–976.
27. Zhang, S.; Yu, T.; Feng, Y. TruthX: Alleviating hallucinations by editing large language models in truthful space. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024, pp. 8908–8949.
28. Sriramanan, G.; Bharti, S.; Sadasivan, V.S.; Saha, S.; Kattakinda, P.; Feizi, S. LLM-Check: Investigating detection of hallucinations in large language models. Advances in Neural Information Processing Systems 2024, 37, 34188–34216.
29. Zhang, F.; Yu, P.; Yi, B.; Zhang, B.; Li, T.; Liu, Z. Prompt-guided internal states for hallucination detection of large language models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2025, pp. 21806–21818.
30. Burns, C.; Ye, H.; Klein, D.; Steinhardt, J. Discovering latent knowledge in language models without supervision. In Proceedings of the International Conference on Learning Representations, 2023.
31. Zou, A.; Phan, L.; Chen, S.; Campbell, J.; Guo, P.; Ren, R.; Pan, A.; Yin, X.; Mazeika, M.; Dombrowski, A.K.; et al. Representation engineering: A top-down approach to AI transparency. arXiv preprint arXiv:2310.01405 2023.
32. Li, J.; Cheng, X.; Zhao, X.; Nie, J.Y.; Wen, J.R. HaluEval: A large-scale hallucination evaluation benchmark for large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 6449–6464.
33. Min, S.; Krishna, K.; Lyu, X.; Lewis, M.; Yih, W.t.; Koh, P.; Iyyer, M.; Zettlemoyer, L.; Hajishirzi, H. FActScore: Fine-grained atomic evaluation of factual precision in long form text generation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, pp. 12076–12100.
34. Dziri, N.; Kamalloo, E.; Milton, S.; Zaïane, O.R.; Yu, M.; Ponti, E.M.; Reddy, S. FaithDial: A faithful benchmark for information-seeking dialogue. Transactions of the Association for Computational Linguistics 2022, 10, 1473–1490.
35. Guo, Z.; Tan, F. Lost in Diffusion: Uncovering Hallucination Patterns and Failure Modes in Diffusion Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, 2026.
36. Varshney, N.; Mishra, S.; Baral, C. Investigating selective prediction approaches across several tasks in IID, OOD, and adversarial settings. In Findings of the Association for Computational Linguistics: ACL 2022, 2022, pp. 1995–2002.
37. Austin, J.; Johnson, D.D.; Ho, J.; Tarlow, D.; Van Den Berg, R. Structured denoising diffusion models in discrete state-spaces. Advances in Neural Information Processing Systems 2021, 34, 17981–17993.
38. Hoogeboom, E.; Nielsen, D.; Jaini, P.; Forré, P.; Welling, M. Argmax flows and multinomial diffusion: Learning categorical distributions. Advances in Neural Information Processing Systems 2021, 34, 12454–12465.
39. Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Advances in Neural Information Processing Systems 2020, 33, 1877–1901.
40. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I.; et al. Language models are unsupervised multitask learners. OpenAI blog 2019, 1, 9.
41. Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems 2020, 33, 6840–6851.
42. Dhariwal, P.; Nichol, A. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems 2021, 34, 8780–8794.
43. Luo, C. Understanding diffusion models: A unified perspective. arXiv preprint arXiv:2208.11970 2022.
44. Ye, J.; Xie, Z.; Zheng, L.; Gao, J.; Wu, Z.; Jiang, X.; Li, Z.; Kong, L. Dream 7B: Diffusion large language models. arXiv preprint arXiv:2508.15487 2025.
45. Yang, Z.; Qi, P.; Zhang, S.; Bengio, Y.; Cohen, W.; Salakhutdinov, R.; Manning, C.D. HotpotQA: A dataset for diverse, explainable multi-hop question answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018, pp. 2369–2380.
46. Joshi, M.; Choi, E.; Weld, D.S.; Zettlemoyer, L. TriviaQA: A large scale distantly supervised challenge dataset for reading comprehension. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017, pp. 1601–1611.
47. Maynez, J.; Narayan, S.; Bohnet, B.; McDonald, R. On faithfulness and factuality in abstractive summarization. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020, pp. 1906–1919.
48. Khalil, H.K.; Grizzle, J.W. Nonlinear systems; Vol. 3, Prentice Hall, Upper Saddle River, NJ, 2002.
49. Posa, M.; Kuindersma, S.; Tedrake, R. Optimization and stabilization of trajectories for constrained dynamical systems. In Proceedings of the 2016 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2016, pp. 1366–1373.
50. Mézard, M.; Parisi, G.; Virasoro, M.A.; Thouless, D.J. Spin glass theory and beyond, 1988.
51. Binder, K.; Young, A.P. Spin glasses: Experimental facts, theoretical concepts, and open questions. Reviews of Modern Physics 1986, 58, 801.
52. Park, S.; Du, X.; Yeh, M.H.; Wang, H.; Li, Y. Steer LLM latents for hallucination detection. In Proceedings of the International Conference on Machine Learning, 2025.
53. Ren, J.; Luo, J.; Zhao, Y.; Krishna, K.; Saleh, M.; Lakshminarayanan, B.; Liu, P.J. Out-of-distribution detection and selective generation for conditional language models. In Proceedings of the International Conference on Learning Representations, 2023.
54. Malinin, A.; Gales, M. Uncertainty estimation in autoregressive structured prediction. arXiv preprint arXiv:2002.07650 2020.
55. Talmor, A.; Herzig, J.; Lourie, N.; Berant, J. CommonsenseQA: A question answering challenge targeting commonsense knowledge. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4149–4158.
56. Zuo, E.; Zhong, J.; Chen, C.; Chen, C.; Ubul, K.; Lv, X. Rethinking unsupervised time series anomaly detection: Dynamic attention based on route inverse-masking. Applied Soft Computing 2025, p. 113971.
57. Pan, J.; Liu, Y.; Zhou, C.; Xiong, F.; Liew, A.W.C.; Pan, S. Correcting False Alarms from Unseen: Adapting Graph Anomaly Detectors at Test Time. In Proceedings of the AAAI Conference on Artificial Intelligence, 2026.
58. Tan, Y.; Long, G.; Jiang, J.; Zhang, C. Influence-oriented personalized federated learning. arXiv preprint arXiv:2410.03315 2024.
59. Li, Y.; Qu, H.; Chen, C.; Lv, X.; Zuo, E.; Wang, K.; Cai, X. TreeXformer: Extracting tabular feature-context information using tree-structured semantics. Information Processing & Management 2025, 62, 104291.
60. Miao, R.; Liu, Y.; Wang, Y.; Shen, X.; Tan, Y.; Dai, Y.; Pan, S.; Wang, X. Blindguard: Safeguarding llm-based multi-agent systems under unknown attacks. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics, 2026.







| Method | Designed for D-LLMs | TriviaQA (128) | TriviaQA (64) | HotpotQA (128) | HotpotQA (64) | CSQA (128) | CSQA (64) | Avg |
|---|---|---|---|---|---|---|---|---|
| LLaDA-8B-Instruct | | | | | | | | |
| Training-based Methods | | | | | | | | |
| CCS | ✗ | 57.1 | 54.2 | 57.6 | 55.8 | 50.5 | 58.5 | 55.6 |
| TSV | ✗ | 60.2 | 61.1 | 65.0 | 59.4 | 52.9 | 55.2 | 59.0 |
| TraceDet | ✓ | 73.9 | 74.1 | 66.1 | 63.7 | 77.2 | 77.1 | 72.0 |
| Training-free Methods | | | | | | | | |
| Perplexity | ✗ | 50.4 | 47.6 | 49.3 | 51.2 | 65.6 | 65.0 | 54.9 |
| LN-Entropy | ✗ | 54.6 | 53.5 | 54.8 | 54.7 | 64.6 | 64.4 | 57.8 |
| Semantic Entropy | ✗ | 68.9 | 67.3 | 57.6 | 53.8 | 44.1 | 43.9 | 55.9 |
| Lexical Similarity | ✗ | 62.5 | 59.0 | 64.2 | 57.1 | 57.3 | 60.7 | 60.1 |
| EigenScore | ✗ | 69.2 | 66.9 | 64.7 | 59.2 | 58.5 | 60.6 | 63.2 |
| TRE (ours) | ✓ | 82.2 | 86.5 | 85.6 | 88.1 | 79.0 | 75.0 | 82.7 |
| Dream-7B-Instruct | | | | | | | | |
| Training-based Methods | | | | | | | | |
| CCS | ✗ | 56.9 | 50.3 | 51.7 | 58.2 | 54.2 | 53.2 | 54.1 |
| TSV | ✗ | 75.6 | 74.7 | 58.7 | 63.0 | 62.3 | 56.8 | 65.2 |
| TraceDet | ✓ | 78.1 | 86.7 | 75.1 | 76.0 | 84.7 | 84.1 | 80.8 |
| Training-free Methods | | | | | | | | |
| Semantic Entropy | ✗ | 73.7 | 72.5 | 62.7 | 67.7 | 51.4 | 48.6 | 62.8 |
| Lexical Similarity | ✗ | 58.3 | 64.0 | 59.7 | 62.7 | 77.3 | 76.9 | 66.5 |
| EigenScore | ✗ | 66.0 | 69.1 | 62.5 | 67.0 | 76.9 | 77.5 | 69.8 |
| TRE (ours) | ✓ | 83.9 | 84.8 | 80.6 | 81.0 | 77.7 | 78.4 | 81.1 |
| Variant | TriviaQA | HotpotQA | CSQA |
|---|---|---|---|
| All tokens | 64.0 | 70.9 | 75.7 |
| Unrevealed | 63.9 | 70.8 | 75.9 |
| Revealed | 53.9 | 84.0 | 60.2 |
| Average | 81.1 | 85.1 | 79.0 |
| Exponential | 82.0 | 85.6 | 79.0 |
| Hard last- | 82.4 | 85.4 | 79.0 |
| TRE (ours) | 82.2 | 85.6 | 79.0 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).