Submitted: 16 November 2023
Posted: 16 November 2023
Abstract
Keywords:
1. Introduction
2. Related Work
3. Data Description
3.1. Data Collection
3.2. Feature Extraction
4. Experiments
4.1. Does learning data contribute to prediction?
4.2. Can progress be predicted through score forecasting?
4.3. Which performance is more appropriate to predict, test or sequence?
5. Conclusion and Future Work


| Source | Indicator | Description |
|---|---|---|
| Test and Learning | studentID | ID of the student |
| Test and Learning | sequenceID | ID of the sequence |
| Test and Learning | topicID | ID of the topic |
| Test and Learning | numTopic | Number of topics in a sequence |
| Test | testNumProblems | Number of problems in the test |
| Test | testPercentCorrect | Percentage of correct answers in the test |
| Test | testTimespent | Time spent on the test |
| Test | firstscore | Score achieved in the first test |
| Test | lastscore | Score achieved in the last test |
| Test | scoreLimit | Maximum possible score for a sequence |
| Learning | behNumProblems | Number of problems in the learning process |
| Learning | behNumRight | Number of problems answered correctly in the learning process |
| Learning | behNumMissed | Number of problems missed in the learning process |
| Learning | behNumSkipped | Number of problems skipped in the learning process |
| Learning | behTimeSpent | Time spent on the learning process |

| Source | Feature | Description |
|---|---|---|
| Test | scoreRatio | firstScore/scoreLimit |
| Learning | sheetCnt | Count of worksheets |
| Learning | topicAvgSheet | Average number of worksheets per topic |
| Learning | sheetAvgTime | Average time spent per worksheet |
| Test and Learning | topicCnt | Count of topics |
| Test and Learning | topicRatio | topicCnt/numTopic |
| Test and Learning | topicAvgTime | Average time spent per topic |
| Test and Learning | accuracy | Average rate of correctly answered problems |
| Test and Learning | probAvgTime | Average time spent per problem |
| Test and Learning | topicDiff | Difference in topicCnt between learning and test |
| Test and Learning | accuracyDiff | Difference in accuracy between learning and test |
| Test and Learning | topicAvgTimeDiff | Difference in topicAvgTime between learning and test |
| Test and Learning | probAvgTimeDiff | Difference in probAvgTime between learning and test |
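
The derived features above can be illustrated with a minimal Python sketch. It covers only a few of them and rests on assumptions not stated in the tables: that accuracy in the learning process is behNumRight/behNumProblems, that testPercentCorrect is on a 0 to 100 scale, that probAvgTime is time spent divided by the number of problems, and that each difference is taken as learning minus test. The `record` dictionary, its values, and the helper names are hypothetical.

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def extract_features(record):
    """Derive a subset of the listed features from one student-sequence record.

    Assumed (not confirmed by the source): differences are learning minus test,
    and testPercentCorrect is a percentage between 0 and 100.
    """
    test_accuracy = record["testPercentCorrect"] / 100.0
    learning_accuracy = safe_div(record["behNumRight"], record["behNumProblems"])
    test_prob_time = safe_div(record["testTimespent"], record["testNumProblems"])
    learning_prob_time = safe_div(record["behTimeSpent"], record["behNumProblems"])
    return {
        "scoreRatio": safe_div(record["firstScore"], record["scoreLimit"]),
        "topicRatio": safe_div(record["topicCnt"], record["numTopic"]),
        "accuracyDiff": learning_accuracy - test_accuracy,
        "probAvgTimeDiff": learning_prob_time - test_prob_time,
    }


# Hypothetical record; the values are made up for illustration only.
record = {
    "firstScore": 60, "scoreLimit": 100,
    "topicCnt": 4, "numTopic": 5,
    "testNumProblems": 10, "testPercentCorrect": 70, "testTimespent": 300,
    "behNumProblems": 20, "behNumRight": 16, "behTimeSpent": 500,
}
features = extract_features(record)
print(features)
```

Guarding the divisions keeps the sketch usable for records with no problems attempted or a zero score limit; how the original pipeline treats such cases is not stated.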

| Model | Precision | Recall | F1-measure |
|---|---|---|---|
| DT | 0.570 | 0.568 | 0.569 |
| RF | 0.586 | 0.614 | 0.591 |
| ANN | 0.607 | 0.645 | 0.573 |
| GBDT | 0.604 | 0.644 | 0.580 |
| avg. | 0.592 | 0.618 | 0.578 |
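
The results tables report precision, recall, and F1-measure without stating how the scores are averaged over classes. The sketch below assumes macro averaging (the unweighted mean of per-class scores); this is an assumption, and the function names and toy labels are hypothetical. Under class averaging, the averaged F1 can fall below both the averaged precision and the averaged recall, which would be consistent with rows such as ANN (0.607, 0.645, 0.573).

```python
def per_class_counts(y_true, y_pred, label):
    """Count true positives, false positives and false negatives for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    return tp, fp, fn


def macro_scores(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels (1 = progress, 0 = no progress); purely illustrative.
y_true = [1, 1, 0]
y_pred = [1, 0, 0]
precision, recall, f1 = macro_scores(y_true, y_pred)
print(precision, recall, f1)
```

In this toy case both classes have an F1 of 2/3, while macro precision and macro recall are each 0.75, so the averaged F1 is lower than both.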

| Model | Precision | Recall | F1-measure |
|---|---|---|---|
| DT | 0.575 | 0.571 | 0.573 |
| RF | 0.616 | 0.639 | 0.619 |
| ANN | 0.625 | 0.654 | 0.598 |
| GBDT | 0.620 | 0.652 | 0.596 |
| avg. (baseline) | 0.609 | 0.629 | 0.596 |
| improvement | 0.017** | 0.011** | 0.018** |

| Model | Precision | Recall | F1-measure |
|---|---|---|---|
| DT | 0.581 | 0.580 | 0.581 |
| RF | 0.624 | 0.654 | 0.602 |
| ANN | 0.630 | 0.657 | 0.588 |
| GBDT | 0.638 | 0.661 | 0.597 |
| avg. | 0.618 | 0.638 | 0.592 |
| improvement | 0.009** | 0.009** | -0.004* |

| Model | Precision | Recall | F1-measure |
|---|---|---|---|
| DT | 0.612 | 0.630 | 0.616 |
| RF | 0.621 | 0.646 | 0.620 |
| ANN | 0.626 | 0.652 | 0.621 |
| GBDT | 0.624 | 0.652 | 0.616 |
| avg. | 0.621 | 0.645 | 0.618 |
| improvement | 0.012** | 0.016** | 0.022** |
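
The "improvement" rows can be checked by simple arithmetic on the "avg." rows. The sketch below copies the four average rows in order of appearance and recomputes the differences. The dictionary keys are hypothetical labels, since the tables carry no captions here, and the choice of reference row for each comparison is inferred from the numbers rather than stated in the text.

```python
# Average rows copied from the four results tables, in order of appearance.
# Keys are hypothetical labels; values are (precision, recall, F1-measure).
AVERAGES = {
    "first": (0.592, 0.618, 0.578),
    "baseline": (0.609, 0.629, 0.596),
    "third": (0.618, 0.638, 0.592),
    "fourth": (0.621, 0.645, 0.618),
}
METRICS = ("precision", "recall", "f1")


def improvement(current, reference):
    """Metric-wise difference between two average rows, rounded to 3 decimals."""
    return tuple(round(c - r, 3) for c, r in zip(current, reference))


# Inferred pairing: the baseline table is compared with the first table,
# and the last two tables are compared with the baseline.
baseline_vs_first = improvement(AVERAGES["baseline"], AVERAGES["first"])
third_vs_baseline = improvement(AVERAGES["third"], AVERAGES["baseline"])
fourth_vs_baseline = improvement(AVERAGES["fourth"], AVERAGES["baseline"])

for name, row in (
    ("baseline vs. first", baseline_vs_first),
    ("third vs. baseline", third_vs_baseline),
    ("fourth vs. baseline", fourth_vs_baseline),
):
    print(name, dict(zip(METRICS, row)))
```

With this pairing, the recomputed differences match the reported improvement rows, including the negative F1 change of -0.004 in the third table.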

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).