Submitted: 12 April 2025
Posted: 15 April 2025
Abstract
Keywords:
1. Introduction
- We introduce PSACI, a prompt-based approach to document-level event causality identification that leverages the capabilities of LLMs directly, replacing complex task-specific architectures with a simpler, more adaptable framework.
- We demonstrate that careful prompt engineering guides LLMs to implicitly capture document structure and perform causal reasoning, yielding competitive performance on benchmark datasets, particularly in the challenging cross-sentence setting.
- We explore incorporating visual context into the PSACI framework, opening new avenues for multimodal event causality identification. Recent work [12,13,14] on vision-language models and efficient visual representations suggests promising directions for extending our approach in this way.
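As a concrete illustration of the prompt-based formulation above, the following Python sketch assembles a document-level causality query for an LLM. The template wording, the function name `build_eci_prompt`, and the answer format are illustrative assumptions, not the exact PSACI prompt.

```python
# Hypothetical sketch of a document-level ECI prompt in the spirit of PSACI.
# The template text and labels are assumptions, not the paper's exact prompt.

def build_eci_prompt(document: str, event_a: str, event_b: str) -> str:
    """Ask an LLM whether event_a causally leads to event_b in the document,
    requesting a short rationale alongside the label (cf. Sec. 4.4)."""
    return (
        "Read the document and decide whether the first event causally leads "
        "to the second event, even if they occur in different sentences.\n\n"
        f"Document: {document}\n"
        f"Event 1: {event_a}\n"
        f"Event 2: {event_b}\n"
        "Answer CAUSAL or NOT CAUSAL, then give a one-sentence rationale."
    )

prompt = build_eci_prompt(
    "The storm knocked out power lines. Hospitals switched to generators.",
    "knocked out",
    "switched",
)
print(prompt)
```

Crucially, the instruction explicitly licenses cross-sentence reasoning, which is where document-level methods most often fail.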
2. Related Work
2.1. Event Causality Identification
2.2. Large Language Models
3. Method
3.1. Elaborated Prompt Design for Causal Identification
3.2. Detailed Learning Strategy via Instruction Tuning
4. Experiments
4.1. Experimental Setup
4.1.1. Datasets
4.1.2. Baselines
4.1.3. Evaluation Metrics
4.2. Main Results
4.3. Ablation Study
4.4. Human Evaluation of Rationales
4.5. Further Analysis
4.5.1. Performance Breakdown by Relation Type
4.5.2. Performance with Varying Document Length
4.5.3. Impact of Prompt Variation
5. Conclusion
References
- S. Zheng, Y. Hao, D. Lu, H. Bao, J. Xu, H. Hao, and B. Xu, “Joint entity and relation extraction based on a hybrid neural network,” Neurocomputing, vol. 257, pp. 59–66, 2017.
- Y. Zhou, X. Geng, T. Shen, J. Pei, W. Zhang, and D. Jiang, “Modeling event-pair relations in external knowledge graphs for script reasoning,” Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 2021.
- M. T. Phu and T. H. Nguyen, “Graph convolutional networks for event causality identification with rich document-level structures,” in Proceedings of the 2021 conference of the North American chapter of the association for computational linguistics: Human language technologies, 2021, pp. 3480–3490.
- S. Lee, S. Seo, B. Oh, K.-H. Lee, D. Shin, and Y. Lee, “Cross-sentence n-ary relation extraction using entity link and discourse relation,” in Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 2020, pp. 705–714.
- Q. Do, W. Lu, and D. Roth, “Joint inference for event timeline construction,” in Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012, pp. 677–687.
- R. Zhao, S. Joty, Y. Wang, and P. Jwalapuram, “Towards causal concepts for explaining language models,” 2023.
- Y. Zhou, T. Shen, X. Geng, C. Tao, J. Shen, G. Long, C. Xu, and D. Jiang, “Fine-grained distillation for long document retrieval,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 17, 2024, pp. 19732–19740.
- J. Devlin, M. Chang, K. Lee, and K. Toutanova, “BERT: pre-training of deep bidirectional transformers for language understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), J. Burstein, C. Doran, and T. Solorio, Eds. Association for Computational Linguistics, 2019, pp. 4171–4186.
- Y. Zhou, X. Li, Q. Wang, and J. Shen, “Visual in-context learning for large vision-language models,” in Findings of the Association for Computational Linguistics, ACL 2024, Bangkok, Thailand and virtual meeting, August 11-16, 2024. Association for Computational Linguistics, 2024, pp. 15890–15902.
- T. Caselli and P. Vossen, “The event storyline corpus: A new benchmark for causal and temporal relation extraction,” in Proceedings of the Events and Stories in the News Workshop, 2017, pp. 77–86.
- I. Hendrickx, S. N. Kim, Z. Kozareva, P. Nakov, D. Ó. Séaghdha, S. Padó, M. Pennacchiotti, L. Romano, and S. Szpakowicz, “Semeval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals,” in Proceedings of the 5th International Workshop on Semantic Evaluation, SemEval@ACL 2010, Uppsala University, Uppsala, Sweden, July 15-16, 2010, K. Erk and C. Strapparava, Eds. The Association for Computer Linguistics, 2010, pp. 33–38. [Online]. Available: https://aclanthology.org/S10-1006/.
- Y. Zhou, L. Song, and J. Shen, “Training medical large vision-language models with abnormal-aware feedback,” arXiv preprint arXiv:2501.01377, 2025.
- Y. Zhou, J. Zhang, G. Chen, J. Shen, and Y. Cheng, “Less is more: Vision representation compression for efficient video generation with large language models,” 2024.
- Y. Zhou, Z. Rao, J. Wan, and J. Shen, “Rethinking visual dependency in long-context reasoning for large vision-language models,” arXiv preprint arXiv:2410.19732, 2024.
- C. Liu, W. Xiang, and B. Wang, “Identifying while learning for document event causality identification,” in Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024, L. Ku, A. Martins, and V. Srikumar, Eds. Association for Computational Linguistics, 2024, pp. 3815–3827.
- H. Man, M. Nguyen, and T. H. Nguyen, “Event causality identification via generation of important context words,” in Proceedings of the 11th Joint Conference on Lexical and Computational Semantics, *SEM@NAACL-HLT 2022, Seattle, WA, USA, July 14-15, 2022, V. Nastase, E. Pavlick, M. T. Pilehvar, J. Camacho-Collados, and A. Raganato, Eds. Association for Computational Linguistics, 2022, pp. 323–330.
- H. Wang, F. Liu, J. Zhang, D. Roth, and K. Richardson, “Event causality identification with synthetic control,” in Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Miami, FL, USA, November 12-16, 2024, Y. Al-Onaizan, M. Bansal, and Y. Chen, Eds. Association for Computational Linguistics, 2024, pp. 1725–1737. [Online]. Available: https://aclanthology.org/2024.emnlp-main.103.
- S. Ding, Y. Mao, Y. Cheng, T. Pang, L. Shen, and R. Qi, “ECIFF: event causality identification based on feature fusion,” in 35th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2023, Atlanta, GA, USA, November 6-8, 2023. IEEE, 2023, pp. 646–653.
- Y. Zhou and G. Long, “Style-aware contrastive learning for multi-style image captioning,” in Findings of the Association for Computational Linguistics: EACL 2023, 2023, pp. 2257–2267.
- X. Zhang, H. Yang, and E. F. Y. Young, “Attentional transfer is all you need: Technology-aware layout pattern generation,” in 58th ACM/IEEE Design Automation Conference, DAC 2021, San Francisco, CA, USA, December 5-9, 2021. IEEE, 2021, pp. 169–174.
- A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, I. Sutskever et al., “Language models are unsupervised multitask learners,” OpenAI blog, vol. 1, no. 8, p. 9, 2019.
- Z. Wang, M. Li, R. Xu, L. Zhou, J. Lei, X. Lin, S. Wang, Z. Yang, C. Zhu, D. Hoiem, S. Chang, M. Bansal, and H. Ji, “Language models with image descriptors are strong few-shot video-language learners,” in Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., 2022.
- J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei, “Scaling laws for neural language models,” CoRR, vol. abs/2001.08361, 2020. [Online]. Available: https://arxiv.org/abs/2001.08361.
- Z. Dai, Z. Yang, Y. Yang, J. G. Carbonell, Q. V. Le, and R. Salakhutdinov, “Transformer-xl: Attentive language models beyond a fixed-length context,” in Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019, Florence, Italy, July 28 - August 2, 2019, Volume 1: Long Papers, A. Korhonen, D. R. Traum, and L. Màrquez, Eds. Association for Computational Linguistics, 2019, pp. 2978–2988.
- Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “Roberta: A robustly optimized BERT pretraining approach,” CoRR, vol. abs/1907.11692, 2019. [Online]. Available: http://arxiv.org/abs/1907.11692.
- M. Shoeybi, M. Patwary, R. Puri, P. LeGresley, J. Casper, and B. Catanzaro, “Megatron-lm: Training multi-billion parameter language models using model parallelism,” CoRR, vol. abs/1909.08053, 2019. [Online]. Available: http://arxiv.org/abs/1909.08053.
- K. Clark, M. Luong, Q. V. Le, and C. D. Manning, “ELECTRA: pre-training text encoders as discriminators rather than generators,” in 8th International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020. OpenReview.net, 2020. [Online]. Available: https://openreview.net/forum?id=r1xMH1BtvB.
- C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text-to-text transformer,” J. Mach. Learn. Res., vol. 21, pp. 140:1–140:67, 2020. [Online]. Available: https://jmlr.org/papers/v21/20-074.html.
| Model | EventStoryLine F1 | Intra-sentence F1 | Inter-sentence F1 | Causal-TimeBank F1 |
|---|---|---|---|---|
| PSACI (Ours) | 53.2 | 66.5 | 48.5 | 63.5 |
| CHEER | 51.4 | 62.6 | 48.4 | 62.3 |
| SENDIR | 51.9 | 66.2 | 48.3 | 61.2 |
| ERGO | 48.1 | 59.0 | 45.8 | 61.7 |
| GPT-3.5 (0-shot) | 22.2 | 35.5 | 16.4 | 36.9 |
| Model | P | R | F1 |
|---|---|---|---|
| PSACI (Ours) | 49.8 | 57.0 | 53.2 |
| PSACI w/o Rationale | 48.5 | 56.4 | 52.2 |
| PSACI Simple Prompt | 46.5 | 55.2 | 50.4 |
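The F1 column in the table above is the harmonic mean of precision and recall; for instance, the full model's 53.2 follows from P = 49.8 and R = 57.0. A minimal check (the helper name `f1_score` is our own):

```python
def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * p * r / (p + r)

# Full PSACI model: P = 49.8, R = 57.0
print(round(f1_score(49.8, 57.0), 1))  # 53.2
```

The remaining rows agree to within rounding of the reported precision and recall.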
| Model | Rationale Coherence (Avg.) | Causal Relevance (Avg.) |
|---|---|---|
| PSACI (Ours) | 4.3 | 4.1 |
| CHEER | 3.8 | 3.6 |
| Model | Intra-sentence P | Intra-sentence R | Intra-sentence F1 | Inter-sentence P | Inter-sentence R | Inter-sentence F1 |
|---|---|---|---|---|---|---|
| PSACI (Ours) | 64.5 | 68.7 | 66.5 | 48.0 | 49.1 | 48.5 |
| CHEER | 61.8 | 63.5 | 62.6 | 47.9 | 49.0 | 48.4 |
| Model | Short Documents | Medium Documents | Long Documents |
|---|---|---|---|
| PSACI (Ours) | 55.1 | 52.8 | 51.9 |
| CHEER | 53.5 | 51.2 | 49.7 |
| Model | F1 |
|---|---|
| PSACI (Original Prompt) | 53.2 |
| PSACI with Elaborated Prompt | 53.8 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).