Submitted:
18 September 2023
Posted:
20 September 2023
Abstract
Keywords:
1. Introduction
2. Related Work
3. Methodology
3.1. Dataset Collection and Preprocessing
3.2. Feature Selection
3.3. Conventional Machine Learning Methods
3.4. Deep Neural Network Learning Models
3.4.1. Convolutional Neural Networks
3.4.2. Long Short-Term Memory
3.4.3. Bidirectional Long Short-Term Memory
3.4.4. Gated Recurrent Unit
3.4.5. Bidirectional Gated Recurrent Unit
3.5. Transformer Architectures
3.5.1. BERT
3.5.2. BERTurk
3.5.3. DistilBERT
3.5.4. RoBERTa
3.6. Evaluation and Statistical Validation Metrics
3.6.1. Performance metrics
3.6.2. Statistical Validation Metrics
4. Experiments and Discussions
- RQ1: How successful are conventional supervised learning methods in classifying software requirements as functional or non-functional?
- RQ2: How successful are deep learning methods in classifying software requirements as functional or non-functional?
- RQ3: How successful are transfer learning models in classifying software requirements as functional or non-functional?
- RQ4: Which of the conventional supervised learning, deep learning, and transfer learning approaches is the most successful in classifying software requirements?
4.1. Procedure Followed in Experiments
4.2. Experimental Results
4.3. Statistical Validation Results
5. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest








| Sample sentences | English translation of sentences | Label |
|---|---|---|
| Sistem olayları mevcut zamandan farklılıklarına göre renklendirecektir. | The system will color events according to their difference from the current time. | FR |
| Kullanıcı Canlı Döviz Takip Uygulaması üzerinden bir bankanın döviz bilgilerini anlık takip edebilecektir. | The user will be able to instantly follow the currency information of a bank through the Live Currency Tracking Application. | FR |
| Sosyal doku analizi uygulaması üzerinde eklenen kullanıcı bilgileri saklanacaktır. | User information added on the social texture analysis application will be stored. | FR |
| RemMed uygulaması üzerinden kullanıcı sağlık günlüğünü doktoru ile paylaşabilecektir. | The user will be able to share their health diary with their doctor through the RemMed application. | FR |
| Kullanıcı e-mail ve şifresi ile sisteme giriş yapabilecektir. | The user will be able to log in to the system with their e-mail and password. | FR |
| Seyahatname uygulaması tarafından kullanıcının yaptığı hatalara karşın doğru hata mesajları verilmelidir. | Correct error messages should be given by the Travelogue application for the mistakes made by the user. | NFR |
| Mobil tabanlı pazaryeri uygulaması aynı anda en az 1000 kullanıcıya hizmet verebilecektir. | The mobile-based marketplace application will be able to serve at least 1000 users at the same time. | NFR |
| Network alt yapısı sistem kaynaklarının her biri için ortalama en fazla %50’sini kullanmalıdır. | The network infrastructure should use, on average, at most 50% of each system resource. | NFR |
| Hesabını Bil uygulaması üzerinde yer alan ekranların yenilenme süresi en fazla 5 saniye olacaktır. | The refresh time of the screens on the Know Your Account application will be a maximum of 5 seconds. | NFR |
| Geliştirilecek oyun programı üzerindeki ekran kontrolleri oyuncunun oyunu oynamasına engel olmayacak büyüklükte olmalıdır. | The screen controls in the game program to be developed should be sized so that they do not prevent the player from playing the game. | NFR |

| Algorithm | F-score | AUC |
|---|---|---|
| NB | .830 | .899 |
| LMT | .914 | .959 |
| RF | .909 | .961 |
| SLR | .899 | .956 |
| JRip | .901 | .887 |
| NBM | .928 | .970 |
| SMO | .913 | .902 |
| LR | .888 | .933 |
| Bagging | .863 | .930 |
| J48 | .826 | .879 |
| Multiclass Classifier | .888 | .933 |

| Algorithm | F-score (CFS) | F-score (GR) | AUC (CFS) | AUC (GR) |
|---|---|---|---|---|
| NB | .787 | .856 | .895 | .897 |
| LMT | .842 | .816 | .91 | .962 |
| RF | .842 | .805 | .909 | .962 |
| SLR | .841 | .904 | .915 | .957 |
| JRip | .836 | .905 | .501 | .501 |
| NBM | .837 | .830 | .503 | .503 |
| SMO | .834 | .915 | .792 | .903 |
| LR | .845 | .891 | .917 | .929 |
| Bagging | .832 | .862 | .896 | .929 |
| J48 | .751 | .821 | .786 | .883 |
| Multiclass Classifier | .845 | .830 | .917 | .929 |

| Algorithm | F-score | AUC |
|---|---|---|
| CNN | .937 | .918 |
| LSTM | .914 | .893 |
| Bi-LSTM | .907 | .881 |
| GRU | .926 | .911 |
| Bi-GRU | .915 | .901 |

| Algorithm | F-score | AUC |
|---|---|---|
| BERT | .921 | .971 |
| BERTurk | .954 | .983 |
| DistilBERT | .918 | .968 |
| RoBERTa | .862 | .952 |

| Algorithm | MCC | Kappa |
|---|---|---|
| NB | .663 | .627 |
| LMT | .809 | .808 |
| RF | .797 | .794 |
| SLR | .775 | .772 |
| JRip | .779 | .778 |
| NBM | .841 | .843 |
| SMO | .806 | .806 |
| LR | .753 | .752 |
| Bagging | .694 | .691 |
| J48 | .616 | .615 |
| Multiclass Classifier | .753 | .752 |

| Algorithm | MCC (CFS) | MCC (GR) | Kappa (CFS) | Kappa (GR) |
|---|---|---|---|---|
| NB | .787 | .856 | .507 | .626 |
| LMT | .842 | .816 | .640 | .816 |
| RF | .844 | .805 | .634 | .805 |
| SLR | .841 | .904 | .637 | .783 |
| JRip | .836 | .905 | .627 | .618 |
| NBM | .837 | .830 | .629 | .626 |
| SMO | .834 | .915 | .621 | .810 |
| LR | .845 | .891 | .646 | .759 |
| Bagging | .832 | .862 | .691 | .688 |
| J48 | .751 | .821 | .431 | .601 |
| Multiclass Classifier | .845 | .834 | .646 | .759 |

| Algorithm | MCC | Kappa |
|---|---|---|
| CNN | .837 | .835 |
| LSTM | .801 | .802 |
| Bi-LSTM | .814 | .817 |
| GRU | .828 | .824 |
| Bi-GRU | .787 | .793 |

| Algorithm | MCC | Kappa |
|---|---|---|
| BERT | .874 | .873 |
| BERTurk | .898 | .897 |
| DistilBERT | .857 | .854 |
| RoBERTa | .789 | .788 |
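
The tables above report F-score, MCC, and Kappa values for each classifier. As a reading aid, the following minimal sketch shows how these three metrics are conventionally derived from a binary confusion matrix (for example, FR as the positive class and NFR as the negative class). The function names and the example counts are illustrative assumptions and are not taken from the study. Whether the reported F-scores are per-class or weighted averages is not stated here, so the sketch computes the positive-class value only.

```python
import math


def f_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall for the positive class."""
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient for a 2x2 confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: agreement between predictions and labels, corrected for chance."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Chance agreement from the marginal totals of predicted and actual classes.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total**2
    if expected == 1:
        return 0.0  # degenerate case: only one class present
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only (not results from the paper).
tp, fp, fn, tn = 90, 10, 10, 90
print(round(f_score(tp, fp, fn), 3))          # 0.9
print(round(mcc(tp, fp, fn, tn), 3))          # 0.8
print(round(cohen_kappa(tp, fp, fn, tn), 3))  # 0.8
```

On a balanced confusion matrix such as this one, MCC and Kappa coincide. They diverge as the class distribution or the error types become asymmetric, which is one reason both are commonly reported alongside F-score and AUC.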

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).