Submitted: 21 October 2024
Posted: 21 October 2024
Abstract
Keywords:
1. Introduction
2. Literature Review
3. Materials and Methods
3.1. SWOT Analysis for Predicting Risks and Academic Success
3.2. Data Preparation
3.3. Modelling the Prediction of Academic Success
4. University Case on Predicting Academic Performance
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest

| STRENGTHS | WEAKNESSES |
|---|---|
| A virtual learning environment; Academic information system; Experience in implementing an early warning system; Highly qualified and competent teachers | There is a lack of information about the current situation of learners; Uneven assessment of learners; Inconsistent monitoring of learner progress; Student dropout; Students are stressed at the end of the semester |
| **OPPORTUNITIES** | **THREATS** |
| Use personalized administrative and learning process data; Digitize the monitoring of learners' progress; Optimize the use of big educational data; Improve the system of providing academic support | Ensuring learner data protection, privacy and confidentiality; Risk of wasting information extracted during data mining; High load on the Moodle server when retrieving data from the database |

| Variable | Description |
|---|---|
| TP1_access_week | The number of student logins to the module "Research Project 1" |
| TP1_clicks_week | The number of student clicks in the module "Research Project 1" |
| VMP_access_week | The number of student logins to the module "Basics of Virtual Learning" |
| VMP_clicks_week | The number of student clicks in the module "Basics of Virtual Learning" |
| key | Student identity pseudonymization key (125 students) |
| success | A class variable with F representing academic failure and T representing academic success |
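
The variables listed above could be held in a small typed record before modelling. The following is a minimal sketch, not the authors' pipeline: the `StudentRecord` class, the `parse_record` helper, the integer field types, and the example values are all assumptions introduced for illustration; only the variable names and the `T`/`F` class labels come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StudentRecord:
    """One row of the (assumed) dataset layout described in the variable table."""
    key: str                # pseudonymized student identifier
    tp1_access_week: int    # logins to "Research Project 1"
    tp1_clicks_week: int    # clicks in "Research Project 1"
    vmp_access_week: int    # logins to "Basics of Virtual Learning"
    vmp_clicks_week: int    # clicks in "Basics of Virtual Learning"
    success: bool           # True for class "T", False for class "F"


def parse_record(row: dict) -> StudentRecord:
    """Convert a raw string-valued row (e.g. from csv.DictReader) into a record."""
    label = row["success"].strip().upper()
    if label not in {"T", "F"}:
        raise ValueError(f"unexpected class label: {label!r}")
    return StudentRecord(
        key=row["key"],
        tp1_access_week=int(row["TP1_access_week"]),
        tp1_clicks_week=int(row["TP1_clicks_week"]),
        vmp_access_week=int(row["VMP_access_week"]),
        vmp_clicks_week=int(row["VMP_clicks_week"]),
        success=(label == "T"),
    )


# Hypothetical row; the numbers are invented purely to show the shape of the data.
example = parse_record({
    "key": "a1b2c3",
    "TP1_access_week": "4",
    "TP1_clicks_week": "37",
    "VMP_access_week": "2",
    "VMP_clicks_week": "15",
    "success": "T",
})
```

Rejecting any label other than `T` or `F` at parse time keeps malformed rows from reaching the classifier silently.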
| Algorithm | Data | Precision | Recall | F-Measure | ROC Area | Class |
|---|---|---|---|---|---|---|
| Decision tree | 5 weeks | 0.5 | 0.278 | 0.357 | 0.531 | F |
| Decision tree | 5 weeks | 0.752 | 0.888 | 0.814 | 0.531 | T |
| Decision tree | 6 weeks | 0.556 | 0.417 | 0.476 | 0.676 | F |
| Decision tree | 6 weeks | 0.786 | 0.865 | 0.824 | 0.676 | T |
| Decision tree | 7 weeks | 0.441 | 0.417 | 0.429 | 0.592 | F |
| Decision tree | 7 weeks | 0.769 | 0.787 | 0.778 | 0.592 | T |
| Decision tree | 8 weeks | 0.429 | 0.417 | 0.423 | 0.598 | F |
| Decision tree | 8 weeks | 0.767 | 0.775 | 0.771 | 0.598 | T |
| Bayesian classifier | 5 weeks | 0.361 | 0.611 | 0.454 | 0.646 | F |
| Bayesian classifier | 5 weeks | 0.781 | 0.562 | 0.654 | 0.646 | T |
| Bayesian classifier | 6 weeks | 0.418 | 0.778 | 0.544 | 0.716 | F |
| Bayesian classifier | 6 weeks | 0.862 | 0.562 | 0.68 | 0.716 | T |
| Bayesian classifier | 7 weeks | 0.41 | 0.694 | 0.515 | 0.737 | F |
| Bayesian classifier | 7 weeks | 0.828 | 0.596 | 0.693 | 0.737 | T |
| Bayesian classifier | 8 weeks | 0.414 | 0.667 | 0.511 | 0.735 | F |
| Bayesian classifier | 8 weeks | 0.821 | 0.618 | 0.705 | 0.735 | T |
| Random forest | 5 weeks | 0.9 | 0.25 | 0.391 | 0.648 | F |
| Random forest | 5 weeks | 0.765 | 0.989 | 0.863 | 0.648 | T |
| Random forest | 6 weeks | 0.857 | 0.333 | 0.48 | 0.734 | F |
| Random forest | 6 weeks | 0.784 | 0.978 | 0.87 | 0.734 | T |
| Random forest | 7 weeks | 0.824 | 0.389 | 0.528 | 0.772 | F |
| Random forest | 7 weeks | 0.796 | 0.966 | 0.873 | 0.772 | T |
| Random forest | 8 weeks | 0.765 | 0.361 | 0.491 | 0.716 | F |
| Random forest | 8 weeks | 0.787 | 0.955 | 0.863 | 0.716 | T |
| Support vector classifier | 5 weeks | 0.556 | 0.139 | 0.222 | 0.547 | F |
| Support vector classifier | 5 weeks | 0.733 | 0.955 | 0.829 | 0.547 | T |
| Support vector classifier | 6 weeks | 0.692 | 0.25 | 0.367 | 0.603 | F |
| Support vector classifier | 6 weeks | 0.759 | 0.955 | 0.846 | 0.603 | T |
| Support vector classifier | 7 weeks | 0.737 | 0.389 | 0.509 | 0.666 | F |
| Support vector classifier | 7 weeks | 0.792 | 0.944 | 0.862 | 0.666 | T |
| Support vector classifier | 8 weeks | 0.737 | 0.389 | 0.509 | 0.666 | F |
| Support vector classifier | 8 weeks | 0.792 | 0.944 | 0.862 | 0.666 | T |
| K-nearest neighbors classifier | 5 weeks | 0.381 | 0.444 | 0.41 | 0.561 | F |
| K-nearest neighbors classifier | 5 weeks | 0.759 | 0.708 | 0.733 | 0.561 | T |
| K-nearest neighbors classifier | 6 weeks | 0.357 | 0.417 | 0.385 | 0.539 | F |
| K-nearest neighbors classifier | 6 weeks | 0.747 | 0.697 | 0.721 | 0.539 | T |
| K-nearest neighbors classifier | 7 weeks | 0.425 | 0.472 | 0.447 | 0.613 | F |
| K-nearest neighbors classifier | 7 weeks | 0.776 | 0.742 | 0.759 | 0.613 | T |
| K-nearest neighbors classifier | 8 weeks | 0.5 | 0.5 | 0.5 | 0.655 | F |
| K-nearest neighbors classifier | 8 weeks | 0.798 | 0.798 | 0.798 | 0.655 | T |

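
The F-Measure column in the table above is consistent with the standard harmonic mean of precision and recall, F = 2PR / (P + R). As a hedged illustration, the sketch below recomputes it for a few rows copied from the table; the function name `f_measure`, the `reported` dictionary, and the 0.002 tolerance (chosen to absorb rounding in the three-decimal inputs) are assumptions made for this example, not part of the original study.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (algorithm, data, class) -> (precision, recall, reported F-measure),
# copied from the results table with decimal points instead of commas.
reported = {
    ("Decision tree", "5 weeks", "F"): (0.5, 0.278, 0.357),
    ("Bayesian classifier", "6 weeks", "F"): (0.418, 0.778, 0.544),
    ("Support vector classifier", "5 weeks", "F"): (0.556, 0.139, 0.222),
    ("K-nearest neighbors classifier", "8 weeks", "T"): (0.798, 0.798, 0.798),
}

# Largest absolute gap between the recomputed and the reported F-measure.
max_gap = 0.0
for (algorithm, data, label), (p, r, f_reported) in reported.items():
    f_computed = f_measure(p, r)
    max_gap = max(max_gap, abs(f_computed - f_reported))
    print(f"{algorithm:32s} {data} class {label}: "
          f"computed {f_computed:.3f}, reported {f_reported:.3f}")

# True when every recomputed value matches the table within rounding tolerance.
consistent = max_gap < 0.002
```

Because the table reports precision and recall rounded to three decimals, small differences in the last digit are expected and do not indicate an error in the reported values.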
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).