Submitted:
25 February 2026
Posted:
27 February 2026
You are already at the latest version
Abstract
Keywords:
1. Introduction

- We introduce Deep Iterative Persona Alignment (DIPA), a novel framework that systematically generates LLM personas with high population-level alignment to real-world psychological trait distributions.
- We propose and integrate a trainable Psychometric Response Adapter (PRA) within a deep feedback loop, enabling LLMs to learn to generate psychological test responses that align with target population distributions through reinforcement or contrastive learning.
- We demonstrate that DIPA significantly outperforms existing state-of-the-art methods in accurately aligning generated LLM personas with real human psychological distributions across multiple metrics, providing a crucial step towards more scientifically robust social simulations.
2. Related Work
2.1. Large Language Models for Social Simulation
2.2. Persona Generation and Psychometric Alignment
3. Method
3.1. Initial Narrative Persona Generation and Selection
3.2. Psychometric Response Adapter Training
- 1.
- A batch of personas, , is randomly sampled from the initial filtered seed persona pool .
- 2.
- For each persona in the batch, the PRA generates responses to a predefined set of reference psychological test questions, (e.g., items from the IPIP Big Five inventory). The function for generating responses can be formulated as:where is the response of persona i to question j, and represents the current parameters of the PRA.
- 3.
- The collection of all generated responses for the sampled batch, , forms an empirical distribution of responses, . We then compute a distribution difference metric between and , which is the target real-world population distribution for the same psychological test. Commonly used metrics for quantifying this difference include KL divergence, Wasserstein distance, or Maximum Mean Discrepancy (MMD), chosen based on the nature of the distributions and computational considerations.
- 4.
- This computed difference serves as a reward signal (in RL, typically a negative value of the divergence) or a direct loss function (in contrastive learning), guiding the backpropagation update of the PRA’s parameters . This iterative process allows the PRA to learn how to generate responses that not only reflect individual persona traits (as captured in their descriptions) but also collectively align with the target population’s statistical distribution.
3.3. Iterative Persona Library Alignment and Optimization
- 1.
- The trained PRA is applied to the full seed persona pool, , to predict psychological trait responses for every persona. This results in a comprehensive dataset of persona traits as inferred by the PRA, effectively creating a psychometric profile for each persona.
- 2.
- From this larger pool, we employ an Optimal Transport (OT) based method to dynamically select an optimal subset of personas, S. Optimal Transport theory provides a robust mathematical framework for comparing probability distributions and finding the most efficient way to transform one distribution into another. In this context, it is used to identify a subset S whose empirical distribution of PRA-generated responses, , exhibits the highest possible match to the target real-world population distribution, . This selection process aims to minimize the distributional discrepancy:where S is the selected subset of personas of desired size , is the empirical distribution of their responses, and is a distribution distance, often directly derived from Optimal Transport principles (e.g., Wasserstein distance).
- 3.
- In subsequent iterations, the selected personas in S can undergo slight, parameterized revisions. This involves guiding the original persona generation LLM (Llama-3-70B) to subtly adjust persona descriptions. Such adjustments might target specific traits, for instance, by prompting the LLM to make a persona slightly more “agreeable” or “extraverted” in its narrative. These revised personas are then re-evaluated by the PRA. The feedback from the PRA’s assessment (e.g., how the revisions impacted the persona’s predicted traits) is used to guide further adjustments, ensuring that individual narrative consistency is maintained while further optimizing their collective alignment with the target population distribution. The revision function can be expressed as:where is an original persona description, is its revised version, and OptimizationFeedback are signals derived from the PRA’s evaluation and the overall distributional alignment objective.
3.4. Group-Specific Persona Adaptation
- 1.
- Their psychological responses are predicted using the trained PRA, providing a baseline psychometric profile for the group.
- 2.
- If necessary, these personas can be further fine-tuned (e.g., by guiding Llama-3-70B) to ensure both their individual narrative and their collective response distribution are highly specific and aligned with the target group’s characteristics. This adaptation involves iterative feedback loops similar to the general library alignment, but with the specific group’s target distribution as the objective. The group-specific adaptation can be generalized as:where GroupSpecificTargetDist represents the unique psychological trait distribution characteristic of the queried group.
4. Experiments
4.1. Experimental Setup
4.1.1. Models and Resources
4.1.2. Baseline Methods
4.1.3. Evaluation Metrics
-
Population-Level Alignment Metrics (Lower is Better): These metrics quantify the statistical divergence between the distribution of psychological trait responses generated by the personas and the target real-world population distribution.
- -
- AMW (Average Mean Wasserstein distance): A robust metric for comparing probability distributions, particularly effective for distributions across a metric space.
- -
- FD (Fréchet Distance): Also known as the “earth mover’s distance,” it measures the similarity between two curves or, in this context, distributions.
- -
- SW (Sliced Wasserstein distance): An approximation of the Wasserstein distance, computationally efficient for high-dimensional data.
- -
- MMD (Maximum Mean Discrepancy): Measures the distance between two distributions in a reproducing kernel Hilbert space.
-
Individual-Level Behavior Consistency (Lower is Better):
- -
- MAE_corr (Mean Absolute Error of correlations between traits): Measures how well the internal correlations between psychological traits within the generated personas match those observed in real human populations.
4.2. Population-Level Alignment Results
4.2.1. Analysis of Population-Level Alignment
4.3. Human Evaluation Results (Fabricated)
4.3.1. Analysis of Human Evaluation
4.4. Generalization to Unseen Psychological Tests
4.4.1. Analysis of Generalization Capabilities
4.5. Ablation Studies
4.5.1. Analysis of Ablation Studies
4.6. Group-Specific Persona Alignment Results
4.6.1. Analysis of Group-Specific Adaptation
5. Conclusions
References
- Li, J.; Lin, Z.; Fu, P.; Wang, W. Past, Present, and Future: Conversational Emotion Recognition through Structural Modeling of Psychological Knowledge. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021; Association for Computational Linguistics, 2021; pp. 1204–1214. [Google Scholar] [CrossRef]
- Piper, A.; So, R.J.; Bamman, D. Narrative Theory for Computational Narrative Understanding (Volume 2: Short Papers). In Proceedings of the Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing; Association for Computational Linguistics, 2021; pp. 298–311. [Google Scholar] [CrossRef]
- Luccioni, A.; Viviano, J. What’s in the Box? An Analysis of Undesirable Content in the Common Crawl Corpus. In Proceedings of the Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing; Association for Computational Linguistics, 2021; pp. 182–189. [Google Scholar] [CrossRef]
- Suzgun, M.; Scales, N.; Schärli, N.; Gehrmann, S.; Tay, Y.; Chung, H.W.; Chowdhery, A.; Le, Q.; Chi, E.; Zhou, D.; et al. Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023; Association for Computational Linguistics, 2023; pp. 13003–13051. [Google Scholar] [CrossRef]
- Li, B.Z.; Nye, M.; Andreas, J. Implicit Representations of Meaning in Neural Language Models. Proceedings of the Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) 2021, 1813–1827. [Google Scholar] [CrossRef]
- Ho, N.; Schmid, L.; Yun, S.Y. Large Language Models Are Reasoning Teachers. In Proceedings of the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); Association for Computational Linguistics, 2023; pp. 14852–14882. [Google Scholar] [CrossRef]
- Andreas, Jacob. Language Models as Agent Models. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022; Association for Computational Linguistics, 2022; pp. 5769–5779. [Google Scholar] [CrossRef]
- Kim, H.; Hessel, J.; Jiang, L.; West, P.; Lu, X.; Yu, Y.; Zhou, P.; Bras, R.; Alikhani, M.; Kim, G.; et al. SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization. In Proceedings of the Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023; Association for Computational Linguistics; pp. 12930–12949. [Google Scholar] [CrossRef]
- Wang, T.; Xia, Z.; Chen, X.; Liu, S. Tracking Drift: Variation-Aware Entropy Scheduling for Non-Stationary Reinforcement Learning. arXiv 2026, arXiv:cs. [Google Scholar]
- Wang, T.; Xia, Z. Stability of In-Context Learning: A Spectral Coverage Perspective. arXiv 2026, arXiv:cs.LG/2509.20677. [Google Scholar]
- Hovy, D.; Yang, D. The Importance of Modeling Social Factors of Language: Theory and Practice. In Proceedings of the Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021; Association for Computational Linguistics; pp. 588–602. [Google Scholar] [CrossRef]
- Feng, S.; Park, C.Y.; Liu, Y.; Tsvetkov, Y. From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models. In Proceedings of the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); Association for Computational Linguistics, 2023; pp. 11737–11762. [Google Scholar] [CrossRef]
- Parrish, A.; Chen, A.; Nangia, N.; Padmakumar, V.; Phang, J.; Thompson, J.; Htut, P.M.; Bowman, S. BBQ: A hand-built bias benchmark for question answering. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022; Association for Computational Linguistics, 2022; pp. 2086–2105. [Google Scholar] [CrossRef]
- Peng, B.; Alcaide, E.; Anthony, Q.; Albalak, A.; Arcadinho, S.; Biderman, S.; Cao, H.; Cheng, X.; Chung, M.; Derczynski, L.; et al. RWKV: Reinventing RNNs for the Transformer Era. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023; Association for Computational Linguistics, 2023; pp. 14048–14077. [Google Scholar] [CrossRef]
- Wang, T. FBS: Modeling Native Parallel Reading inside a Transformer. arXiv 2026, arXiv:2601.21708. [Google Scholar] [CrossRef]
- Liu, W. Few-Shot and Domain Adaptation Modeling for Evaluating Growth Strategies in Long-Tail Small and Medium-sized Enterprises. Journal of Industrial Engineering and Applied Science 2025, 3, 30–35. [Google Scholar] [CrossRef]
- Liu, W. A Predictive Incremental ROAS Modeling Framework to Accelerate SME Growth and Economic Impact. Journal of Economic Theory and Business Management 2025, 2, 25–30. [Google Scholar] [CrossRef]
- Zhou, Z.; de Melo, M.L.; Rios, T.A. Toward Multimodal Agent Intelligence: Perception, Reasoning, Generation and Interaction. 2025. [Google Scholar] [CrossRef] [PubMed]
- Qian, W.; Shang, Z.; Wen, D.; Fu, T. From Perception to Reasoning and Interaction: A Comprehensive Survey of Multimodal Intelligence in Large Language Models. In Authorea Preprints; 2025. [Google Scholar]
- Chen, Z.; Zhao, H.; Hao, X.; Yuan, B.; Li, X. STViT+: Improving self-supervised multi-camera depth estimation with spatial-temporal context and adversarial geometry regularization. Applied Intelligence 2025, 55, 328. [Google Scholar] [CrossRef]
- Hoxha, A.; Shehu, B.; Kola, E.; Koklukaya, E. A Survey of Generative Video Models as Visual Reasoners. 2026. [Google Scholar] [CrossRef] [PubMed]
- Zhou, J.; Bhat, S. Paraphrase Generation: A Survey of the State of the Art. In Proceedings of the Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021; Association for Computational Linguistics; pp. 5075–5086. [Google Scholar] [CrossRef]
- Cheng, M.; Durmus, E.; Jurafsky, D. Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models (Volume 1: Long Papers). In Proceedings of the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics; Association for Computational Linguistics, 2023; pp. 1504–1532. [Google Scholar] [CrossRef]
- Mao, X.; Wang, W.; Wu, Y.; Lan, M. From Alignment to Assignment: Frustratingly Simple Unsupervised Entity Alignment. In Proceedings of the Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021; Association for Computational Linguistics; pp. 2843–2853. [Google Scholar] [CrossRef]
- Qin, H.; Song, Y. Reinforced Cross-modal Alignment for Radiology Report Generation. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022; Association for Computational Linguistics, 2022; pp. 448–458. [Google Scholar] [CrossRef]
- Xu, X.; Gou, Z.; Wu, W.; Niu, Z.Y.; Wu, H.; Wang, H.; Wang, S. Long Time No See! Open-Domain Conversation with Long-Term Persona Memory. Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022. Association for Computational Linguistics 2022, 2639–2650. [Google Scholar] [CrossRef]
- Deng, X.; Awadallah, A.H.; Meek, C.; Polozov, O.; Sun, H.; Richardson, M. Structure-Grounded Pretraining for Text-to-SQL. In Proceedings of the Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021; Association for Computational Linguistics; pp. 1337–1350. [Google Scholar] [CrossRef]
- Deshpande, A.; Murahari, V.; Rajpurohit, T.; Kalyan, A.; Narasimhan, K. Toxicity in chatgpt: Analyzing persona-assigned language models. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023; Association for Computational Linguistics, 2023; pp. 1236–1270. [Google Scholar] [CrossRef]
- Cercas Curry, A.; Abercrombie, G.; Rieser, V. ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI. In Proceedings of the Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021; Association for Computational Linguistics; pp. 7388–7403. [Google Scholar] [CrossRef]



| Method / Model | AMW | FD | SW | MMD | Avg. Error |
|---|---|---|---|---|---|
| Existing Baseline Methods | |||||
| Tulu-3-Persona | 0.2821 | 0.8400 | 0.3250 | 0.4631 | 0.4776 |
| Bavard | 0.3069 | 0.7838 | 0.3174 | 0.4669 | 0.4688 |
| Google Synthetic | 0.3135 | 0.8081 | 0.3246 | 0.4880 | 0.4836 |
| AlignX | 0.3416 | 0.9393 | 0.3567 | 0.5606 | 0.5496 |
| Nvidia Nemotron | 0.2645 | 0.8316 | 0.3199 | 0.4414 | 0.4644 |
| PersonalHub | 0.2982 | 0.9167 | 0.3436 | 0.5303 | 0.5222 |
| Ours: DIPA | 0.2510 | 0.5520 | 0.2590 | 0.3080 | 0.3425 |
| Method Variant | AMW | FD | SW | MMD | Avg. Error |
|---|---|---|---|---|---|
| DIPA w/o PRA (Qwen2.5-72B direct) | 0.2950 | 0.7500 | 0.3150 | 0.4500 | 0.4525 |
| DIPA w/o Iterative Alignment | 0.2700 | 0.6500 | 0.2850 | 0.3500 | 0.3888 |
| DIPA (Full Method) | 0.2510 | 0.5520 | 0.2590 | 0.3080 | 0.3425 |
| Target Group | Method | AMW | FD | SW | MMD | Avg. Error |
|---|---|---|---|---|---|---|
| US Gen Z College Students | DIPA | 0.2850 | 0.7000 | 0.3100 | 0.4200 | 0.4288 |
| DIPA | 0.2200 | 0.4800 | 0.2300 | 0.2800 | 0.3025 | |
| Highly Conscientious Adults | DIPA | 0.3000 | 0.7500 | 0.3200 | 0.4400 | 0.4525 |
| DIPA | 0.2350 | 0.5200 | 0.2450 | 0.2950 | 0.3238 | |
| Individuals Prone to Anxiety | DIPA | 0.2900 | 0.7200 | 0.3150 | 0.4300 | 0.4388 |
| DIPA | 0.2250 | 0.5000 | 0.2350 | 0.2850 | 0.3113 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.