Submitted: 16 February 2026
Posted: 26 February 2026
Abstract
Keywords:
1. Introduction
1. P30 Task Enablement: Medical models show extreme context-sensitivity spikes at the summarization position (8/8 models); philosophy models show none (0/4).
2. Two Temporal Patterns: Medical tasks produce U-shaped dynamics with a diagnostic trough; philosophy tasks produce inverted-U dynamics peaking mid-conversation.
3. Disruption Sensitivity: Context presence matters more than order (12/12 models).
2. Background
2.1. Context Sensitivity in LLMs
2.2. The Framework
3. Methods
3.1. Dataset
3.2. Position-Level Computation
3.3. Statistical Analysis
4. Results
4.1. Finding 1: P30 Task Enablement
Medical (8 models):
- Mean (SD , all )
- 8/8 models exceeded threshold (see the sketch after this list)
- Effect is architecture-independent (spans 7 vendors)
Philosophy (4 models):
- Mean (range: to )
- 0/4 models exceeded threshold
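As a concrete (hypothetical) reading of the threshold check: the sketch below assumes per-model ΔRCI values are available per position and defines the P30 spike as the ratio of the summarization-position value to the P1-P29 baseline mean; the `SPIKE_THRESHOLD` constant and the spike definition are illustrative assumptions, not the paper's exact criterion.

```python
from statistics import mean

SPIKE_THRESHOLD = 2.0  # illustrative cutoff; the paper's exact criterion is not reproduced here

def p30_spike_ratio(delta_rci: list[float]) -> float:
    """Ratio of the P30 (summarization-position) ΔRCI to the mean over P1-P29.

    `delta_rci` is assumed to hold one value per conversation position, P1 first.
    """
    baseline = mean(delta_rci[:29])   # positions P1..P29
    return delta_rci[29] / baseline   # position P30

# Illustrative trajectories: flat with a late spike (medical-like) vs. flat throughout.
medical_like = [0.35] * 29 + [1.10]
philosophy_like = [0.30] * 30
print(p30_spike_ratio(medical_like) > SPIKE_THRESHOLD)     # True  (spike at P30)
print(p30_spike_ratio(philosophy_like) > SPIKE_THRESHOLD)  # False (no spike)
```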
4.2. Finding 2: Domain-Specific Temporal Patterns
Medical (U-shaped):
- Early (P1-10): High (mean = 0.347)
- Mid (P11-20): Diagnostic trough (mean = 0.311)
- Late (P21-29): Rising (mean = 0.371)
- Pattern: Early > Mid, Late > Mid
Philosophy (inverted-U):
- Early (P1-10): Moderate (mean = 0.307)
- Mid (P11-20): Peak (mean = 0.331)
- Late (P21-29): Declining (mean = 0.270)
- Pattern: Mid > Early, Mid > Late (see the sketch after this list)
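The phase comparison above can be reproduced mechanically from position-level values. A minimal sketch (the trajectory data itself is assumed, not reproduced): it aggregates positions into the three phases used above and applies the same mean comparisons to label a trajectory U-shaped or inverted-U.

```python
from statistics import mean

def phase_means(delta_rci: list[float]) -> dict[str, float]:
    """Aggregate per-position ΔRCI (P1 first) into the three phases used above."""
    return {
        "early": mean(delta_rci[0:10]),   # P1-10
        "mid":   mean(delta_rci[10:20]),  # P11-20
        "late":  mean(delta_rci[20:29]),  # P21-29
    }

def classify(delta_rci: list[float]) -> str:
    m = phase_means(delta_rci)
    if m["early"] > m["mid"] and m["late"] > m["mid"]:
        return "U-shaped (diagnostic trough)"        # medical pattern
    if m["mid"] > m["early"] and m["mid"] > m["late"]:
        return "inverted-U (mid-conversation peak)"  # philosophy pattern
    return "indeterminate"
```

Applied to the phase means reported above (0.347/0.311/0.371 vs. 0.307/0.331/0.270), the two branches return the medical and philosophy labels respectively.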
4.3. Finding 3: Disruption Sensitivity
- Medical: Mean (range: to )
- Philosophy: Mean (range: to )
- 12/12 models: context presence matters more than order (see the sketch after this list)
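To make the presence-versus-order contrast concrete, a hedged sketch: the `score_*` arguments stand in for whatever final-turn quality metric the protocol uses under intact, order-shuffled, and removed context; the names, numbers, and comparison are illustrative, not the paper's exact procedure.

```python
def disruption_effects(score_intact: float,
                       score_shuffled: float,
                       score_removed: float) -> dict[str, float]:
    """Quality cost of two context disruptions on the same final-turn task.

    `score_shuffled` reorders the conversation history; `score_removed`
    deletes it entirely. A larger cost means a more damaging disruption.
    """
    return {
        "order":    score_intact - score_shuffled,  # cost of scrambling order
        "presence": score_intact - score_removed,   # cost of deleting context
    }

# The reported finding corresponds to presence > order for all 12/12 models:
effects = disruption_effects(score_intact=0.80, score_shuffled=0.72, score_removed=0.45)
print(effects["presence"] > effects["order"])  # True
```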
4.4. Finding 4: Position-Level Dynamics
5. Discussion
5.1. Two Fundamental Patterns
5.2. Clinical Implications
5.3. Mechanistic Interpretation
5.4. Limitations
1. Two domains: We tested medical and philosophy only. Whether the Type 1/Type 2 distinction generalizes to other domains (coding, legal, creative) requires further study.
2. Position-level noise: Raw trajectories show prompt-specific oscillations. The inverted-U and U-shape patterns emerge clearly only in phase-level aggregation.
3. Scaling hypothesis: Preliminary observation suggests Type 2 task enablement may scale logarithmically with context length, but with only two anchor points (P10, P30) this remains speculative (see the worked fit after this list).
4. Model count: Philosophy had 4 models vs. medical's 8, limiting cross-domain statistical power.
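On limitation 3: a logarithmic model fitted through exactly two anchor points is fully determined, so the data cannot distinguish logarithmic growth from any other monotone form. A worked version of the hypothetical two-point fit:

```latex
% Hypothetical two-point logarithmic fit (illustrative; not the paper's model)
\Delta\mathrm{RCI}(P) \approx a + b \ln P, \qquad
b = \frac{\Delta\mathrm{RCI}(P_{30}) - \Delta\mathrm{RCI}(P_{10})}{\ln 30 - \ln 10}, \qquad
a = \Delta\mathrm{RCI}(P_{10}) - b \ln 10 .
```

With only two points, $a$ and $b$ absorb all the available information; a third anchor (e.g., P20 or P60) would be needed to test the logarithmic form against alternatives.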
6. Conclusions
Acknowledgments
References
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems 2017, 30.
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; et al. Language models are few-shot learners. Advances in Neural Information Processing Systems 2020, 33.
- Guo, X.; Vosoughi, S. Serial position effects of large language models. In Findings of the Association for Computational Linguistics: ACL 2025, 2025; pp. 927–953.
- Chen, Y.; et al. Fortify the shortest stave in attention: Enhancing context awareness of large language models for effective tool use. In Proceedings of the 62nd Annual Meeting of the ACL, 2024; pp. 11160–11174.
- Liu, N. F.; Lin, K.; Hewitt, J.; Paranjape, A.; Bevilacqua, M.; Petroni, F.; Liang, P. Lost in the middle: How language models use long contexts. Transactions of the Association for Computational Linguistics 2024, 12, 104–123.
- Asgari, E.; et al. A framework to assess clinical safety and hallucination rates of LLMs for medical text summarisation. npj Digital Medicine 2025, 8, 274.
- Polonioli, A. Moving LLM evaluation forward: lessons from human judgment research. Frontiers in Artificial Intelligence 2025, 8, 1592399.
- Singhal, K.; et al. Large language models encode clinical knowledge. Nature 2023, 620, 172–180.
- Laxman, M. M. Context curves behavior: Measuring AI relational dynamics with ΔRCI. Preprints.org 2026a, 202601.1881.
- Laxman, M. M. Standardized Context Sensitivity Benchmark Across 25 LLM-Domain Configurations. Preprints.org 2026b, 202602.1114.



