Requirements are predominantly written in Natural Language (NL), which makes them accessible to stakeholders with varying degrees of experience, whereas a model-based language requires special training. Despite this benefit, NL can introduce ambiguities and inconsistencies into requirements, which may eventually degrade system quality or lead to outright system failure. The complexity of current systems warrants an integrated and comprehensive approach to system design and development. This need has brought about a paradigm shift towards Model-Based Systems Engineering (MBSE) approaches to system design and away from traditional document-centric methods. While MBSE shows great promise, the ambiguities and inconsistencies present in NL requirements hinder their direct conversion to models. The field of Natural Language Processing (NLP) has demonstrated great potential in facilitating the conversion of NL requirements into a semi-machine-readable format that enables their standardization and use in a model-based environment. A first step towards standardizing requirements is classifying them according to the ``type'' (design, functional, performance, etc.) they represent. To that end, a language model capable of classifying requirements needs to be fine-tuned on labeled aerospace requirements. This paper presents the development of an annotated aerospace requirements corpus and its use to fine-tune BERT$_\text{BASE-UNCASED}$ to obtain aeroBERT-Classifier: a new language model for classifying aerospace requirements into design, functional, or performance requirements. A comparison between aeroBERT-Classifier and bart-large-mnli (zero-shot text classification) showed that aeroBERT-Classifier performs better at classifying aerospace requirements, despite being fine-tuned on a small labeled dataset.
Keywords
Requirements Engineering; Natural Language Processing; NLP; BERT; Requirements Classification; Text Classification
Subject
Computer Science and Mathematics, Computer Science
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.