Educational data mining, student academic performance prediction, prediction methods, algorithms and tools: An overview of reviews

This overview study set out to compare and synthesise the findings of review studies conducted on predicting student academic performance (SAP) in higher education using educational data mining (EDM) methods, EDM algorithms and EDM tools from 2013 to June 2020. It conducted multiple searches for suitable and relevant peer-reviewed articles on two online search engines, on nine online databases, and on two online academic social networks. It then selected 26 eligible articles from 2,050 articles. Some of the findings of this overview study are worth mentioning. First, only 2 studies explicitly stated their precise sample sizes, and maths and science were the two most mentioned subject areas. Second, 16 review studies had purposes related to either EDM techniques, EDM methods, EDM models, or EDM algorithms employed to predict SAP and student success in the higher education sector. Third, there are six commonly used typologies of input variables reported by the 26 review studies, of which student demographics was the most commonly utilised variable for predicting SAP. Fourth and last, seven common EDM algorithms employed for predicting SAP were identified, of which Decision Tree emerged both as the most used algorithm and as the algorithm with the highest prediction accuracy rate for predicting SAP.


Introduction
The last few years have witnessed an exponential increase in review studies exploring educational data mining (EDM) methods, algorithms and tools for predicting student academic performance (SAP) [1,2,3,4,5,6]. This is the case for diverse disciplinary fields, even though fields such as computer science and engineering seem to have conducted more such studies than others [7]. Most EDM review studies on predicting SAP have been conducted as either reviews [2,5,8,9,10,11,12,13]; literature reviews [14,15,16]; systematic literature reviews [17,18,19,20]; systematic reviews [4,21,22,23,24,25]; review syntheses [26]; or surveys [27,28]. While these review study types are not exhaustive, they represent a broad spectrum of the types of review studies that the current paper was able to locate. In addition, whereas review studies have been conducted on predicting SAP using EDM methods, algorithms and tools, there is a dearth of overviews of review studies in this particular field of EDM. To this end, the present paper is intended to fill this gap and contribute to this area of EDM, particularly against the backdrop of an exponential rise in review studies focusing on this area.
Moreover, an overview of reviews on predicting SAP using EDM methods, algorithms and tools is necessary since review studies mostly review aspects related to SAP prediction in varying degrees. Additionally, elsewhere, Kim et al. [29] maintain that, generally, review studies vary in their scope, focus and comprehensiveness. Thus, harnessing and synergising the different scopes and foci of various review studies predicting SAP in higher education using EDM methods, algorithms and tools is vital. This is what the current paper attempts to do.

Contextualising Issues
This paper uses an overview of reviews in the same sense as a review of reviews. In an overview of reviews (hereafter an overview or an overview study), review studies or aspects featuring in review studies become key units or foci of analysis as opposed to aspects of primary studies [30,31]. There are different terms used to refer to a review of reviews. These include review of reviews, second-order review, umbrella review, tertiary review, meta-meta-analysis, synthesis of meta-analysis, synthesis of systematic reviews, summary of systematic reviews, or systematic review of systematic reviews [30][31][32][33][34]. These terms constitute typologies of overviews. These typologies reflect the roles played by the respective overviews and the purposes these overviews are meant to serve. Nonetheless, a central feature of all these overview typologies is synthesising findings derived from review studies in line with their traditions. To this effect, Pieper et al. [33] contend that there is no standard definition of a review of reviews and that the term itself is still not well defined (also cf. Hunt et al., 2018). The current overview falls within three strands: narrative, thematic and mixed-methods overview (cf. Kim et al., 2018).
Benefits of utilising overviews are:
• retrieving, identifying, assessing and integrating findings from several review studies
• leveraging previous research syntheses
• broadening evidence synthesis questions which cannot be posed through reviews [35], or defining research problems in broader terms
• aggregating the evidence provided by multiple reviews or contrasting multiple treatments on the same topic
• monitoring trends and changes in research over time
• contributing to the knowledge base that transcends the one reported in existing individual reviews
• identifying a gap in existing reviews [30,31,33].
There are, however, challenges associated with overviews. Four of these challenges are overlap, bias, non-up-to-dateness and a lack of methodological rigour. The first challenge relates to an instance in which the same primary studies (and data) appear in two or more reviews, while the second challenge has to do with biased reporting. Biased reporting may entail a bias towards particular (included) review studies, and under-reporting or over-reporting of aspects of review studies at the expense of others; it may also involve inconsistent reporting. This risk of bias can be exacerbated by missing or inadequate data from review studies or from primary studies [31,34,35]. The third challenge arises when review studies are not up-to-date, or when outdated review studies are considered in lieu of the recent ones [31]. The fourth challenge is a lack of methodological rigour, which is mostly attributable to the paucity of methodological standards and reporting guidelines [30,31,33].

Predicting Student Academic Performance Using EDM Techniques
Student academic performance (SAP) is a crucial construct employed to determine student academic success at different educational levels [5,23]. Even though it has multiple definitions [36], at a basic level, SAP is the performance that students display in their academic tasks (e.g., assignments, tests and examinations). It is often reflected in students' past cumulative grade point average (CGPA)/grade point average (GPA) in a previous semester and in students' expected GPA in the current semester. If the term performance is disaggregated from the phrase student academic performance, it embodies achievement in relation to assignments and courses, continuous progress in programmes, and a successful completion of programmes [2,18]. Moreover, it entails persistence, retention, progression and wastage [37], and success or progress [38,39]. In this sense, student academic performance should be seen in the same way as student academic achievement [14]. However, SAP is a complex construct, and multiple factors affect it. These include the historical academic performance and the socio-economic background of students.
In this regard, some of the factors (also known as attributes) employed to predict SAP are: academic factors (historical and current); student demographics; socio-economic factors; psychological factors; student e-learning activities; student environments; and extracurricular activities [18,24]. The superordinate factors listed in the preceding set are often utilised to predict SAP by most scholars [2,14,18,24,36,37,40,41]. These superordinate factors are further categorised into specific subordinate factors with the former serving as input variables or performance features, and with the latter serving as output variables or performance metrics [18]. Nonetheless, at times there are overlaps between the superordinate and subordinate factors as certain scholars tend to conflate them [7,14,18,24,36,41].
Moreover, certain methods (or tasks) such as association rule mining, clustering, classification and regression are used for building models for predicting SAP. Such methods are at times referred to as techniques [18,38], while Saa et al. [4] call them EDM approaches. Of these, classification tends to be the predominantly used method. Furthermore, there are algorithms that are employed to predict SAP. Among them are Artificial Neural Network (ANN), Bayesian Network (BN), Decision Tree (DT), K-Nearest Neighbour (K-NN), K-Means, Naïve Bayesian (NB) classifiers, Neural Network (NN), and Support Vector Machine (SVM) [4,7,24,37,40,41]. Other examples of such algorithms are Random Tree (RT); Random Forest (RF); REPTree; J48; LADTree; Sequential minimal optimisation (SMO) [4]; JRip; OneR; CART; C4.5; and Iterative Dichotomiser 3 (ID3) [2,4,38]. Still others include Markov Networks and Collaborative Multi-Regression models [7]. In certain instances, these algorithms are referred to as EDM techniques [4,7,36], or as tasks or as methods [36]. The choice of prediction algorithms is determined by the SAP outcomes to be predicted. For instance, classification algorithms such as DT, NN and NB classifiers are commonly used for predicting a binary outcome like pass/fail at a certain degree of probability [4,7]. By contrast, SVM and linear regression are often employed for predicting numerical scores [7].
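To make the idea of classifying a binary pass/fail outcome more concrete, the following is a minimal sketch of a one-level decision tree (a "decision stump") written in plain Python. It is not drawn from any of the reviewed studies: the GPA values, the pass/fail labels and the function names are all hypothetical, and the sketch assumes that higher feature values indicate a pass.

```python
# Illustrative only: synthetic data and a one-level decision tree (a "stump").
# Feature values, labels and names are hypothetical, not from any reviewed study.

def gini(labels):
    """Gini impurity of a list of binary labels (1 = pass, 0 = fail)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(values, labels):
    """Return the threshold on one numeric feature that minimises weighted Gini."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x < threshold]
        right = [y for x, y in zip(values, labels) if x >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold


def predict(threshold, value):
    """Predict pass (1) at or above the threshold (assumes higher is better)."""
    return 1 if value >= threshold else 0


# Hypothetical previous-semester GPA (input) and pass/fail outcome (output).
gpa = [1.8, 2.1, 2.4, 2.9, 3.2, 3.6]
passed = [0, 0, 0, 1, 1, 1]

threshold = fit_stump(gpa, passed)
predictions = [predict(threshold, g) for g in gpa]
print("threshold:", threshold)
print("predictions:", predictions)
```

Full decision-tree algorithms such as C4.5 or CART repeat this kind of split recursively over many input variables; the stump only shows the basic splitting step on a single variable.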
Among other things, EDM in higher education is used for predicting SAP, monitoring student progress, assessing student learning and getting insights into students' learning experiences [38]. Lastly, one of the major arguments is that predicting SAP early enough helps instructors review instruction and take appropriate actions with a view to improving students' success trajectory. Levels targeted for predicting SAP include: a course level (predicting SAP in a given course); a year level (predicting SAP at the end of a given year); an examination level (predicting SAP in an examination for a specific course); and a degree level (predicting SAP at the time of completing a degree) [14].

Purpose of the Study
As mentioned earlier, at the time of writing this paper, studies on overviews of reviews on predicting SAP using EDM methods, algorithms and tools were not evident. Only review studies were evident. Again, as noted earlier, most of such reviews fall into the following categories: reviews; literature reviews; systematic literature reviews; systematic reviews; review syntheses; and surveys. Against this backdrop, the purpose of this paper is to compare and synthesise findings of review studies [35] conducted on predicting SAP in higher education through utilising EDM methods, algorithms and tools from 2013 to June 2020. The major focus is on review studies related to the higher education sector. The following served as research questions (RQs) for this study:
• RQ1: What are the primary purposes of the review studies investigated in this overview?
• RQ2: What common input (predictor) and common output (target) variables do these review studies employ to predict SAP?
• RQ3: What common educational data mining (EDM) techniques (or methods) and algorithms do they employ in predicting SAP?
• RQ4: What algorithms are reported to have the highest prediction accuracy for SAP?
• RQ5: What common EDM tools do these studies employ in predicting SAP?
• RQ6: What are the key results of these review studies?
In addition to the purpose and research questions mentioned above, this overview study was informed by the following review guidelines: literature search strategy; eligibility criteria; selection of studies; and data extraction, coding and inter-coder reliability. These guidelines were adapted from the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) [42,43].

Literature Search Strategy
The search strategy for relevant review studies was conducted online from March 2020 to June 2020, and started by locating search engines, databases, and academic social networking sites. Subsequently, two online search engines (Google and Bing), nine online databases (Google Scholar, Microsoft Academic, Semantic Scholar, IEEE Xplore, ERIC, ScienceDirect, Emerald, JSTOR, SpringerLink), and two online academic social networks (ResearchGate and Academia.edu) were identified (see Figure 1). Search strings were arranged into super- and sub-strings in keeping with the major area of focus of the overview: predicting SAP through using EDM methods, algorithms, and tools. These search strings consisted of the following keywords: predicting student academic performance; educational data mining techniques; educational data mining algorithms; and educational data mining software tools. To ensure that a wide range of review studies on the major focus area of this overview were covered in all the search combinations, two commonly used Boolean operators, AND and OR, together with parentheses and double quotation marks (where necessary), were employed in the search strategy. Examples of these search combinations were as follows:
• predicting student academic performance AND educational data mining techniques AND educational data mining algorithms AND educational data mining software tools
• predicting student academic performance OR educational data mining techniques OR educational data mining algorithms OR educational data mining software tools
• (predicting student academic performance) AND (educational data mining techniques) AND (educational data mining algorithms) AND (educational data mining software tools)
• "predicting student academic performance" OR "educational data mining techniques" OR "educational data mining algorithms" OR "educational data mining software tools"
In certain instances, the word techniques was replaced with methods and tasks.
The afore-said keyword combinations, together with their relevant iterations, were queried in the two search engines, in the nine online databases, and in the two online academic social networking sites mentioned earlier. In relation to the Google search engine and Google Scholar, one-word keywords followed by a proximity connector (AROUND) and a number in parentheses were queried as follows: EDM AROUND(4) algorithms AROUND (4). Different permutations of the keywords aligned to the key focus area of this overview were employed. Moreover, a dependency and snowball search strategy was employed based on the bibliographies of the journal articles obtained from the three sets of online search platforms.
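The search combinations described in this section can be generated programmatically. The sketch below is only an illustration of how the four keywords might be joined with AND or OR, with optional parentheses or double quotation marks, and with techniques swapped for a synonym. The helper names `build_query` and `with_synonym` are hypothetical and were not part of the search procedure reported here.

```python
# Illustrative helper for composing Boolean search strings.
# Function names are hypothetical; the keywords are those listed in this section.

KEYWORDS = [
    "predicting student academic performance",
    "educational data mining techniques",
    "educational data mining algorithms",
    "educational data mining software tools",
]


def build_query(keywords, operator, wrap=None):
    """Join keywords with AND or OR.

    wrap: None for bare keywords, "parens" for (keyword), "quotes" for "keyword".
    """
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    if wrap == "parens":
        terms = [f"({k})" for k in keywords]
    elif wrap == "quotes":
        terms = [f'"{k}"' for k in keywords]
    else:
        terms = list(keywords)
    return f" {operator} ".join(terms)


def with_synonym(keywords, synonym):
    """Replace the word 'techniques' with a synonym such as 'methods' or 'tasks'."""
    return [k.replace("techniques", synonym) for k in keywords]


queries = [
    build_query(KEYWORDS, "AND"),
    build_query(KEYWORDS, "OR"),
    build_query(KEYWORDS, "AND", wrap="parens"),
    build_query(KEYWORDS, "OR", wrap="quotes"),
    build_query(with_synonym(KEYWORDS, "methods"), "AND"),
]
for query in queries:
    print(query)
```

Generating the strings from one keyword list helps keep the wording identical across search platforms, although each platform may interpret operators, parentheses and quotation marks differently.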

Eligibility Criteria and Selection of Studies
The criteria for including and excluding review studies are as listed below. They were formulated to respond to the major focus area of the current overview.
• review studies focusing on predicting SAP using EDM methods (techniques or tasks), algorithms and tools;
• focus on higher education;
• review studies published between 2013 and June 2020;
• review studies published in peer-reviewed journals and by (internationally) recognised conference organisations;
• mention of the specific years/duration covered (e.g., 2010 to 2015); and
• review studies published in English.
Review studies were identified and selected by following a four-phase selection process informed by the PRISMA approach as illustrated in Figure 1. One of the key aspects of this approach is to ascertain that there is clarity and transparency in the search and selection processes [42,44]. The first phase involved screening articles, which were obtained from the three sets of online search platforms by querying a combination of search strings mentioned earlier. This phase yielded 2,500 articles. The second phase entailed screening these articles by reviewing their titles. This resulted in 260 articles being retained. Thereafter, the third phase was conducted during which 200 irrelevant and duplicate articles were eliminated by reviewing their abstracts and keywords. In the fourth phase, 34 irrelevant articles were identified and excluded after reviewing their contents and foci, resulting in 26 full-text articles judged as relevant being retained. These 26 articles served as the major source of data sets for the current overview.
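The arithmetic of the four-phase selection process can be checked with a short script. The counts below are the ones reported in the preceding paragraph; the variable names are illustrative only.

```python
# Counts as reported in the four-phase selection process described above.
identified = 2500             # phase 1: articles returned by the searches
retained_after_titles = 260   # phase 2: retained after title screening
excluded_on_abstracts = 200   # phase 3: irrelevant and duplicate articles removed
excluded_on_full_text = 34    # phase 4: irrelevant articles removed

# Derived counts for each stage of the flow.
excluded_on_titles = identified - retained_after_titles
retained_after_abstracts = retained_after_titles - excluded_on_abstracts
included = retained_after_abstracts - excluded_on_full_text

flow = [
    ("identified", identified),
    ("retained after title screening", retained_after_titles),
    ("retained after abstract/keyword screening", retained_after_abstracts),
    ("included in the overview", included),
]
for stage, count in flow:
    print(f"{stage}: {count}")
```

The derived figures (2,240 articles excluded on titles, 60 retained after abstract and keyword screening, and 26 included) follow from the reported counts by subtraction.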

Data Extraction, Coding and Inter-Rater Reliability
Data sets, based on the purpose and on the major focus area of the overview, were extracted from 26 full-text articles mentioned above. A coding scheme consisting of categories based on 14 specific features of the major focus area (see Table 1) was developed. Examples of these categories are: total sample size; purpose of review; input variables; output variables; and EDM techniques. Raters used this coding scheme to extract data from the 26 articles, code them, and match them to each of these categories. To ensure data extraction and data coding consistency, three raters extracted and coded data. The coding protocol used was based on Miles and Huberman's [45] inter-rater reliability (IRR), which employs the following formula:

$$\text{reliability} = \frac{\text{number of agreements}}{\text{number of agreements} + \text{disagreements}}$$
In keeping with this formula, the three raters had a mean IRR of 77% agreement for all the data they had coded for the 14 categories. An IRR of 77% agreement is deemed to be sufficiently reliable [45,46,47,48].

Table 1. Summary of the key aspects of each review article
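The inter-rater reliability formula given in this section (agreements divided by agreements plus disagreements) can be sketched in a few lines of Python. How agreement was averaged across the three raters is not stated here, so the sketch assumes a simple mean over all rater pairs. The coded values and function names are hypothetical and do not reproduce the study's data or its 77% figure.

```python
from itertools import combinations


def agreement_rate(codes_a, codes_b):
    """reliability = agreements / (agreements + disagreements) for two raters."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both raters must code the same non-empty set of items")
    agreements = sum(a == b for a, b in zip(codes_a, codes_b))
    disagreements = len(codes_a) - agreements
    return agreements / (agreements + disagreements)


def mean_pairwise_agreement(raters):
    """Average the two-rater agreement rate over every pair of raters (assumption)."""
    pairs = list(combinations(raters, 2))
    return sum(agreement_rate(a, b) for a, b in pairs) / len(pairs)


# Hypothetical codes assigned by three raters to the same five items.
rater_1 = ["classification", "clustering", "regression", "classification", "clustering"]
rater_2 = ["classification", "clustering", "classification", "classification", "clustering"]
rater_3 = ["classification", "regression", "regression", "classification", "clustering"]

pair_rate = agreement_rate(rater_1, rater_2)
mean_rate = mean_pairwise_agreement([rater_1, rater_2, rater_3])
print(f"rater 1 vs rater 2: {pair_rate:.0%}")
print(f"mean pairwise agreement: {mean_rate:.0%}")
```

Percentage agreement of this kind does not correct for chance agreement, which is one reason a threshold has to be chosen for what counts as sufficiently reliable.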

Data Analysis
Two related and complementary techniques were used to analyse data sets: content analysis and thematic analysis. The choice of these two analytic approaches was informed by the types of data sets extracted from the 26 articles. Content analysis lent itself well to quantitatively representing categories and themes extracted from the data, while thematic analysis was employed to qualitatively present these categories and themes [49,50].

Findings
The findings presented in this section of the overview are grounded on the data extracted from the 26 full-text articles and are informed by the manner in which the extracted data were codified as highlighted in the relevant section above. Additionally, the findings respond to the six research questions stated earlier.

A Panoramic View of the Twenty-Six Review Studies
Of the 26 review studies investigated, 8 were reviews; 6 were systematic reviews; 4 were systematic literature reviews; 3 each were literature reviews and surveys; and the last 2 were a review and synthesis and a comparative analysis (see Figure 2). In all, there were seven different types of reviews, with classical reviews as a typology accounting for the largest share of these review studies.

Figure 2. Types of review studies reviewed
Additionally, the authors of these 26 review studies came from diverse countries of origin, although in some cases they shared the same country. For instance, as depicted in Figure 3, 6 reviews were written by authors based in India, while 2 studies each were written by authors from Malaysia, Italy, Saudi Arabia and Spain. In addition, 6 reviews had authors from 6 different single countries; 5 reviews had authors from 5 different pairs of countries; and one review had its authors from multiple (4) countries (see Table 1). The study with the longest duration (longest time span) is review study 12, which covered a 22-year duration (1999-2019) (see Figure 4; also see Table 1). It contrasts with review study 11, whose duration is 3 years (2007-2010). The study that had the most articles is review study 16, which reviewed 402 articles. Its converse is review study 13, which focused on 6 articles. There are 3 studies that mentioned precise subject areas, with natural sciences (maths and science) mentioned by all 3 studies, and computer science and engineering appearing in 2 studies. By contrast, 5 studies mentioned vague subject areas, while 18 studies did not mention their subject areas. As for sample sizes, 2 studies provided precise sample sizes, which collectively totalled 44,739 participants. Eight studies provided vague sample sizes, while 16 did not state their sample sizes.

Purposes of the Review Studies
As illustrated in Figure 5 (also see Table 2), 16 review studies had purposes focusing on EDM techniques, EDM methods, EDM models, or EDM algorithms used to predict SAP and student success in higher education. Of these review studies, 12 explicitly mentioned SAP or academic/student performance in their purposes, with three of them mentioning both SAP and dropout prediction. Of the remaining four, three made reference to predictive models, while the last one referred to predicting student success. For the remaining ten review studies, six had their purposes on reviewing or surveying EDM techniques and tools, and three had their purposes on student dropout prediction. The other remaining review study did not mention its purpose.

Common Input (Predictor) Variables and Common Output (Predicted) Variables Employed as Reported by Review Studies
Six typologies of input (predictor) variables emerged as the common typologies of input variables used for predicting SAP by the reviewed studies. These are pre-university academic factors; university academic factors; student demographics; family factors; psychological factors; and student e-learning activities (see Table 3). Of these collective factors, student demographics appears in 26 review studies. It is followed by both university academic factors and psychological factors, each of which appears in 15 review studies. High school background and high school performance scores rank as the most common pre-university academic factors employed, whereas course assessment scores are reported as the commonly used aggregated attribute for university academic factors. For student demographics, gender and age are the two common attributes reported to have been used, while family is the common attribute reported to have been employed for family factors. The common attribute for psychological factors is surveys, and student log data is the commonly used attribute for student e-learning activities.

Table 3. Common (input) predictor variables employed as reported by review studies
As regards the common output variables, both pre-university academic factors and university academic factors emerged as the two frequently used attributes under these types of SAP predictor variables (see Table 4).

Table 4. Common output variables employed as reported by review studies

Common EDM Methods Employed as Reported by Review Studies
There are seven commonly used EDM methods for predicting SAP as reported by review studies (see Table 5). Of these, the most commonly used EDM method is classification, which is reported by 16 review studies. It is followed by clustering, which is reported by 13 review studies. Association rule and regression are tied, as each is reported by 11 review studies. Naïve Bayes is the least commonly used as it is referenced by only 6 review studies.

Common EDM Algorithms (Classifiers) and Common EDM Software Tools Employed as Reported by Review Studies
Pertaining to the commonly used EDM algorithms for predicting SAP, there are seven algorithms referenced by the reviewed studies (see Table 6). Of these seven EDM algorithms, DT is the most commonly used algorithm as it is mentioned and cited by 18 review studies. It is followed by ANN (n = 16), SVM (n = 15) and NB (n = 12), respectively. Naïve Bayes classifiers are the least commonly used EDM algorithm for predicting SAP. However, when Bayesian classifiers are clustered together, they emerge as the most frequently utilised EDM algorithms as reported by 21 review studies.

Table 6. Common EDM algorithms (classifiers) employed as reported by review studies
Seven of the review studies reported on the EDM techniques or algorithms with the highest student performance prediction accuracy rate. Among these studies, DT is reported to have the highest prediction accuracy rate by four studies (a 100% and a 99% prediction accuracy rate by one study). It is followed by Naïve Bayes, which has a mixed prediction accuracy rate: two studies rate it as having a high prediction accuracy rate (one of which rates it as having a prediction accuracy rate of 100%), whereas two studies rate it as having a low prediction accuracy rate (a 76% prediction accuracy rate in one study).
In this context, three EDM software tools are reported as frequently used for predicting SAP. These are WEKA, SPSS and RapidMiner, with WEKA as the most commonly used of the three EDM software tools (see Table 7).

Table 7. Common EDM software tools as reported by review studies

Discussion
This section discusses the findings of the current overview study as presented in the findings section. Most importantly, the discussion of the findings is structured in response to the research questions (RQs) of the study. As pointed out above, twenty-six review studies constituted the focal point of the present overview. Except for four studies, the rest (n = 22) were reviews of different typologies: classical (n = 8), systematic (n = 6), systematic literature (n = 4), literature (n = 3), and synthesis (n = 1) reviews. In their review of reviews, Kim et al. [29] investigated qualitative reviews (narrative and thematic reviews) and quantitative reviews (systematic and meta-analysis reviews) as part of the articles included in their review of reviews (n = 171) in hospitality and tourism. In a different but related instance, McKenzie and Brennan [35] highlight that overviews focus on and integrate results from multiple systematic reviews. They even make mention of synthesis overviews, which are overviews of synthesis reviews. Likewise, Polanin et al. [31] argue that a proliferation of systematic reviews in education research has resulted in an increase in overviews of reviews.
As stated under the findings section, the authors of the 26 review studies had diverse countries of origin. Nonetheless, some authors were from the same countries of origin, while 6 review studies shared India as the same country of origin (see Figure 3). Two studies that also had authors and countries of origin as some of their focal points of analysis are Ifenthaler and Yau's [51] and Saa et al.'s [4] review studies. Ifenthaler and Yau's [51] study, which was a systematic review study of 46 publications, had the United States of America as the country with the most authors (n = 13). Contrarily, Saa et al.'s [4] study does not provide any other information about authors' and studies' countries of origin other than having this aspect tabulated as one of the items in a given table.
Concerning subject areas, maths and science featured in all 3 studies that mentioned their subject areas, and computer science and engineering were mentioned by 2 studies. In addition, 2 studies mentioned sample sizes, which together totalled 44,739. A review of reviews in a different but related area that offers subject areas on which its reviews focused is Kim et al. [29]. Of the 13 reviews this overview reviewed, economics and finance (n = 29), customer behaviour (n = 24) and marketing (n = 22) are reported as the top three subject areas mentioned by the reviewed studies, respectively. The overview mentions that sample sizes of its 171 reviews ranged from fewer than 10 to more than 10,000, with systematic reviews having the highest sample sizes. Similarly, Ifenthaler and Yau's [51] systematic review of 46 publications highlights computer science and engineering as the top two subject areas, in that order. Its two publications with the largest sample sizes, both of which were conference papers, had sample sizes of 474,977 and 85,281, respectively. In the current overview, the 3 reviews that mentioned their subject areas were a comparative analysis, a systematic literature review and a review. The 2 reviews that stated their precise sample sizes were a systematic literature review and a literature review (see Table 1).
Pertaining to the purposes of the 26 reviews, it emerged that the purposes of 16 reviews had to do with either EDM techniques, EDM methods, EDM models, or EDM algorithms utilised to predict SAP and student success in higher education. By contrast, of the remaining 10 studies, six reviewed or surveyed EDM techniques and tools, whereas three focused on student dropout prediction. The last one never stated its purpose. A study that had purposes (or objectives) as one of its focal points of analysis is Khanna et al.'s [23] systematic review, which had reviewed 13 articles. Among the purposes of the 13 articles it analysed, educational data mining (EDM) methods or techniques employed for predicting student performance featured prominently in the purposes of 10 of these articles. The other study, Papamitsiou and Economides' [20] systematic literature review of 40 articles, had six purposes, of which prediction of student performance was the second most common purpose after student behaviour modelling. Another study that profiled the purposes (or contributions) of the 10 articles it reviewed is Ganesh and Christy's [27] survey. Three of these profiled purposes had to do with predicting student performance, with two more dealing with student dropout and student retention. In some of the purposes highlighted by these two review studies lies a confluence of purposes, especially pertaining to predicting SAP using EDM methods or techniques, between them and the present overview.
Of the six typologies of input variables reported to have been used by the 26 review studies, student demographics emerged as the most commonly used input variable for predicting SAP, with both gender and age as the most common attributes. It was followed by both university academic factors and psychological factors, with course assessment scores and surveys as the most common attributes for each of these two collective factors, respectively. In Khasanah's [2] review of 10 articles, student personal information and family information were the two most popular collective factors used, with gender and age, and father education and mother education, as their most common attributes, each. Pre-university (high school results) and university (GPA and assessment grades) factors and student demographics (gender and age) are the most influential factors reported in Alyahyan and Düştegör's [14] literature review of 19 articles. In a different but related context, Liz-Domínguez et al.'s [19] systematic literature review of 25 articles on predicting early warning systems in higher education, student log data featured as the common input attribute employed by 10 of these articles. Moreover, Mat et al. [36] argue that gender and assessment marks were among the key attributes used by studies (number not provided) they reviewed. For output variables, both pre-university academic factors and university academic factors were the two frequently employed cluster of factors with reference to these types of SAP predictor variables.
As characterised in the findings section, there were seven common EDM methods that were identified for predicting SAP as reported by the 26 review studies. The three most commonly used methods were classification, clustering, and association rule and regression (last two had a tie), respectively, while Naïve Bayes was the least utilised method. Similarly, both classification and clustering were the most popularly used EDM methods in Papamitsiou and Economides' [20] systematic literature review, while regression was the third most used method. Classification was found to have been the most popularly used EDM method in Ganesh and Christy's [27] survey of 10 articles, with association rule and clustering as the second and third most used methods, successively. Again, classification was found to be the top-most utilised EDM method (n = 40) by Del Río and Insuasti's [11] review study of 56 articles. In the same vein, classification was the top-most used EDM method (44%) in Alyahyan and Düştegör's [14] literature review of 19 articles. However, both regression and clustering were the least employed methods at 3% and 2%, apiece. On the contrary, association rules and clustering featured as the two most utilised EDM methods in Manjarres et al.'s [15] literature review of 127 papers. Once more, Mat et al. [36] contend that classification, regression and clustering were the common EDM methods used in the articles they reviewed in their study.
In relation to the seven EDM algorithms identified from the 26 review studies, Decision Tree (DT) was found to be the most commonly employed for predicting SAP, with Artificial Neural Network (ANN) and Support Vector Machine (SVM) being the second and third most used algorithms, respectively, while Naïve Bayes (NB) was the least used algorithm. Nonetheless, as a cluster, Bayesian classifiers were the most frequently utilised overall. One review study that found DT to be the most used EDM algorithm is Cui et al.'s [10] review of 121 articles. DT was referenced by 46 of these articles, followed by Naïve Bayes (n = 32), SVM (n = 26), and neural networks (NN) and multi-layer perceptron (MLP) (n = 26). Similarly, DT had a frequency of 49, as opposed to its two nearest algorithms, Bayesian classifiers (f = 36) and NN (f = 29), in Agrusti et al.'s [21] systematic review of 73 studies. Likewise, Alyahyan and Düştegör's [14] literature review (n = 19 articles) singles out DT algorithms (e.g., J48, C4.5, Random tree, and REPTree) as the top-most used algorithms (44%), followed by Bayesian algorithms (19%) and ANN (10%). To this end, Kumar et al. [52] concur that DT, NB and ANN were among the most utilised EDM algorithms for predicting SAP in their literature survey.
In another scenario, DT and Naïve Bayesian classifiers (as categories) had the frequencies of 35 (24.8%) and 14 (9.9%) out of the total number of 141 algorithms identified from 34 articles in Saa et al.'s [4] systematic review. However, when viewed as individual algorithms, Naïve Bayesian classifiers had the frequency of 13 (38.2%), followed by SVM with the frequency of 8 (23.5%). DT had the frequency of 4 (11.8%).
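The percentages quoted above from Saa et al.'s [4] review can be checked with simple arithmetic. The short sketch below is illustrative only: it assumes that the category-level percentages use the 141 identified algorithms as the denominator, and that the individual-algorithm percentages use the 34 reviewed articles as the denominator (an inference from the reported figures, not something stated explicitly). The helper name `pct` is hypothetical.

```python
def pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Category-level frequencies, assumed to be out of 141 identified algorithms.
category_level = {
    "DT": pct(35, 141),
    "Naive Bayesian classifiers": pct(14, 141),
}

# Individual-algorithm frequencies, assumed to be out of 34 reviewed articles.
individual_level = {
    "Naive Bayesian classifiers": pct(13, 34),
    "SVM": pct(8, 34),
    "DT": pct(4, 34),
}

for label, table in (("category", category_level), ("individual", individual_level)):
    for name, value in table.items():
        print(f"{label:>10} | {name:<28} {value:.1f}%")
```

Under these assumed denominators, the computed values match the reported percentages, which illustrates how the ranking of DT changes depending on whether algorithms are counted as categories or as individual algorithms.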
In terms of student performance prediction accuracy, only seven review studies stated which EDM techniques or algorithms achieved such accuracy. DT emerged as the EDM algorithm with the highest student performance prediction accuracy rate, as mentioned by 4 of the 7 studies, while Naïve Bayes had mixed prediction accuracy rates. That is, it was reported as having a high prediction accuracy rate by 2 studies, whereas 2 other studies ranked it as having a low prediction accuracy rate. In Ganesh and Christy's [27] survey, DT generated the most consistent prediction results as opposed to Naïve Bayes, J48 and JRip. Similar precision results were reported by Agrusti et al. [21] in their systematic review of 73 articles. In contrast, ANN was found to have the highest prediction accuracy for student performance as compared to DT in Shahiri et al.'s [5] review of 30 papers.
Lastly, pertaining to EDM software tools for predicting SAP, WEKA emerged as the most commonly employed tool, followed by SPSS and RapidMiner. WEKA was similarly found to be the EDM software tool used by 15 of the 20 papers (even though in one instance it was used in tandem with RapidMiner), while RapidMiner and Matlab were each used by 3 papers in Kumar et al.'s [53] review. In the same vein, WEKA appeared in 14 articles, followed by SPSS (n = 9), R (n = 8) and RapidMiner (n = 5), in Agrusti et al.'s [21] systematic review of 73 articles.

Conclusions, Limitations and Further Research
As stated earlier, the purpose of this overview was to compare and synthesise the findings of review studies conducted on predicting SAP in higher education using EDM methods, algorithms and tools from 2013 to June 2020. The overview had six research questions, and its specific focus was on review studies in higher education. Even though the 26 review studies spanned diverse countries of origin, India emerged as the country with the most review studies. For subject areas, maths and science were the most cited by the review studies that mentioned their fields of study; they were followed by computer science and engineering, which featured in 2 review studies. By contrast, in other review studies, such as Ifenthaler and Yau's [51] study, computer science and engineering feature as the top two subject areas. Concerning sample size, only 2 studies explicitly stated their precise sample sizes, which totalled 44,739.
With regard to purposes, 16 review studies had purposes related to either EDM techniques, EDM methods, EDM models, or EDM algorithms employed to predict SAP and student success in the higher education sector. Six studies had purposes that reviewed or surveyed EDM techniques and tools, while the purposes of three studies focused on student dropout prediction. One study did not mention its purpose.
There are six commonly used typologies of input variables that were reported by the 26 review studies, of which student demographics was the most commonly utilised variable for predicting SAP, with gender and age as the two most common attributes within this typology. University academic factors and psychological factors emerged as the next two commonly used typologies of input variables, with course assessment scores and surveys, respectively, as their most commonly used attributes.
Among the EDM methods used for predicting SAP, the most commonly used were classification and clustering, followed by association rule and regression (the last two were tied in third place). Naïve Bayes was the least utilised method. Of the seven commonly used EDM algorithms identified from the 26 review studies for predicting SAP, DT was the most commonly employed, followed by Artificial Neural Network (ANN) and Support Vector Machine (SVM), respectively, with Naïve Bayes (NB) as the least used algorithm. Nevertheless, as a cluster of algorithms, Bayesian classifiers were the most predominantly used. Moreover, DT was the EDM algorithm reported as having the highest prediction accuracy rate for predicting SAP, with Naïve Bayes having a mixed prediction accuracy rate. With respect to EDM software tools, WEKA was the most commonly utilised tool, followed by SPSS and RapidMiner.
Finally, it is critical that future reviews on predicting SAP using EDM methods, algorithms and tools avoid the pitfalls identified above and those highlighted elsewhere in this overview. Most importantly, more overview studies are needed to build on the current one, with a view to comparing and synthesising the different aspects of existing and future review studies that focus on predicting SAP using EDM methods, algorithms and tools.
Author Contributions: the author [CC] has solely contributed to this paper.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.