Scaling Behaviour of Faithfulness-Aware LSHT Transformers for Multi-Document Summarization

Sameer Kumar Singh; Suhrid Pandey

doi:10.20944/preprints202605.1359.v1

Submitted: 19 May 2026

Posted: 20 May 2026

Abstract

Recent work on neural scaling demonstrates consistent performance gains with increased data and model capacity, yet these improvements are typically assessed using surface-level metrics that do not capture factual reliability. In multi-document summarization (MDS), this limitation is particularly acute, as scaling has been shown to amplify hallucination and content distortion. In this paper, we investigate the empirical scaling behaviour of faithfulness-aware transformers under tightly controlled conditions, using LSHT as a fixed architectural and training baseline. Rather than proposing new scaling laws, we analyze how summarization quality, faithfulness and efficiency evolve as dataset size and model capacity are independently increased, while holding architecture, optimization, decoding and hardware constant. All experiments are conducted exclusively on the Multi-News benchmark to avoid cross-dataset confounds. Across ROUGE, coverage, repetition and faithfulness-oriented metrics, we show that lexical overlap and factual consistency follow distinct scaling dynamics. Faithfulness improves most rapidly during early data scaling (approximately 3–4% relative gain from 3k to 12k samples) but exhibits diminishing marginal returns at larger scales, whereas ROUGE continues to increase more smoothly. We further show that faithfulness is more sensitive to data diversity than to volume alone and identify practical scaling regimes that maximize faithfulness gains relative to computational cost. These results establish empirical expectations for scaling faithfulness-aware MDS systems and provide actionable guidance for reliable summarization under realistic resource constraints.

Keywords: multi-document summarization; faithfulness; hallucination; transformers; scaling behaviour; empirical analysis

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
