Preprint | Article | Version 2 | Preserved in Portico | This version is not peer-reviewed

Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights using the Aviation Safety Reporting System (ASRS)

Version 1 : Received: 4 July 2023 / Approved: 4 July 2023 / Online: 4 July 2023 (10:08:18 CEST)
Version 2 : Received: 4 July 2023 / Approved: 10 July 2023 / Online: 11 July 2023 (07:13:20 CEST)

A peer-reviewed article of this Preprint also exists.

Tikayat Ray, A.; Bhat, A.P.; White, R.T.; Nguyen, V.M.; Pinon Fischer, O.J.; Mavris, D.N. Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights Using the Aviation Safety Reporting System (ASRS). Aerospace 2023, 10, 770.

Abstract

This research investigates the potential application of generative language models, especially ChatGPT, in aviation safety analysis as a means to make safety analyses more efficient and reduce the time it takes to process incident reports. In particular, ChatGPT was used to generate incident synopses from narratives, which were then compared with ground-truth synopses from the Aviation Safety Reporting System (ASRS) dataset. The comparison used embeddings from Large Language Models (LLMs), with aeroBERT demonstrating the highest similarity due to its aerospace-specific fine-tuning. A positive correlation was observed between synopsis length and cosine similarity. In a subsequent phase, the human factor issues that ChatGPT identified in incidents were compared with those identified by safety analysts. A concurrence rate of 61% was found, with ChatGPT taking a cautious approach to attributing human factor issues. Finally, the model was used to attribute incidents to relevant parties. Because no dedicated ground-truth column existed for this task, a manual evaluation was conducted. ChatGPT attributed the majority of incidents to the Flight Crew, ATC, Ground Personnel, and Maintenance. This study opens new avenues for leveraging AI in aviation safety analysis.
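As a rough illustration of the comparison step described in the abstract, the sketch below computes the cosine similarity between two embedding vectors. It is a minimal example under stated assumptions, not the authors' pipeline: the function name `cosine_similarity` and the four-dimensional vectors are hypothetical stand-ins, and in practice the vectors would come from an embedding model such as aeroBERT.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Made-up toy vectors standing in for embeddings of a generated synopsis
# and a ground-truth synopsis; real embeddings have hundreds of dimensions.
generated_synopsis_vec = [0.2, 0.1, 0.7, 0.4]
reference_synopsis_vec = [0.25, 0.05, 0.65, 0.5]

score = cosine_similarity(generated_synopsis_vec, reference_synopsis_vec)
print(f"cosine similarity: {score:.3f}")
```

A value near 1 indicates that the two vectors point in nearly the same direction, which is how a higher score is read as a closer match between a generated and a reference synopsis.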

Keywords

Aviation Safety Reporting System; ASRS; Aviation Safety; Human Factors; Large Language Models; LLM; ChatGPT; Generative Language Models; GPT-3.5; aeroBERT; BERT; InstructGPT; Prompt Engineering

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning

Comments (1)

Comment 1
Received: 11 July 2023
Commenter: Archana Tikayat Ray
Commenter's Conflict of Interests: Author
Comment: Some typos were corrected.