Large language models (LLMs) are increasingly explored for clinical documentation support, yet the influence of prompting architecture on documentation quality in complex longitudinal contexts remains poorly characterized. This controlled retrospective methodological study evaluated three prompting strategies—Single Prompt (SP), Section-Based Prompt (SBP), and Section-Based Prompt with Writing Refinement (SBP+W)—for generating inpatient rehabilitation discharge reports using an OpenAI large language model (GPT-5.2).
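For orientation, the sketch below illustrates how the three prompting strategies differ in structure. It is a minimal illustration under assumptions: the prompts, section headings, model identifier, and function names are hypothetical and do not reproduce the study's actual materials.

```python
# Illustrative sketch only: prompts, section headings, and model identifier
# are assumptions, not the study's published prompt materials.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-5.2"  # assumed identifier for the GPT-5.2 model used in the study

SECTIONS = ["Admission status", "Functional course", "Therapies", "Discharge recommendations"]


def single_prompt(case_text: str) -> str:
    """SP: generate the entire discharge report in one request."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user",
                   "content": f"Write a rehabilitation discharge report for:\n{case_text}"}],
    )
    return resp.choices[0].message.content


def section_based_prompt(case_text: str, refine: bool = False) -> str:
    """SBP: one request per report section; SBP+W adds a final writing-refinement pass."""
    parts = []
    for section in SECTIONS:
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": f"Write only the '{section}' section of a rehabilitation "
                                  f"discharge report for:\n{case_text}"}],
        )
        parts.append(resp.choices[0].message.content)
    draft = "\n\n".join(parts)
    if refine:
        # Writing-refinement pass (SBP+W): polish style without altering clinical content
        resp = client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user",
                       "content": f"Improve readability and flow without changing clinical content:\n{draft}"}],
        )
        draft = resp.choices[0].message.content
    return draft
```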
Twenty anonymized rehabilitation cases involving prolonged hospital stays and multidimensional functional documentation were processed under standardized model conditions. AI-generated reports were compared with human-authored summaries. Two blinded board-certified rehabilitation physicians independently evaluated outputs using a structured 4-point ordinal scale assessing structural integrity, clinical coherence, completeness, and readability. Inter-rater reliability was estimated with quadratic weighted Cohen’s kappa and bootstrap confidence intervals. Group differences were analyzed using non-parametric testing and exploratory multivariable modeling.
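A minimal sketch of the reliability estimate is shown below, assuming the two raters' 4-point ordinal scores are available as parallel arrays; the scores shown are fabricated placeholders, not study data.

```python
# Quadratic weighted Cohen's kappa with a percentile bootstrap CI (sketch).
import numpy as np
from sklearn.metrics import cohen_kappa_score

rater_a = np.array([4, 3, 4, 2, 3, 4, 3, 4, 2, 4])  # hypothetical scores
rater_b = np.array([4, 3, 3, 2, 3, 4, 4, 4, 2, 3])

# Quadratic weights penalize larger disagreements on the ordinal scale more heavily
kappa = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

# Percentile bootstrap over rated cases for a 95% confidence interval
rng = np.random.default_rng(0)
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(rater_a), len(rater_a))
    boot.append(cohen_kappa_score(rater_a[idx], rater_b[idx], weights="quadratic"))
ci_low, ci_high = np.percentile(boot, [2.5, 97.5])

print(f"kappa = {kappa:.2f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]")
```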
All LLM prompting strategies achieved significantly higher expert-rated quality scores than human-authored reports (p < 0.01). SBP demonstrated the highest median performance and strongest regression effect, although differences among LLM-based strategies were not statistically significant after correction. Prompting strategy explained more variability in expert ratings than case-level factors.
Structured section-based prompting may represent a practical design lever for improving perceived quality in AI-assisted clinical documentation workflows.
Keywords: artificial intelligence; clinical documentation; discharge reports; large language models; medical writing; prompt architecture; prompt engineering; rehabilitation medicine.