1. Introduction and Preliminaries
This study examines the impact of prior English learning experience on students’ academic success in university-level English courses. It draws on data from a survey completed by first-year students enrolled in Computer Science and related degree programs (majors) at FMI before the start of their English course.
In this article, we focus on how students rate their own English skills at the beginning of the course and whether these perceptions are confirmed by their actual test performance. We also investigate which background factors – such as previous exposure to English inside and outside school – are most closely related to success in the course.
To understand what best predicts achievement, we apply a high-performance machine learning (ML) method, CART Ensemble and Bagging (CART Ebag), to both the survey responses and the students’ language test results. The goal is to gain insight into how different types of prior experience influence performance and, in turn, to inform more responsive teaching strategies tailored to students in STEAME-related fields. This study builds upon previous research that explores how social, economic, and motivational factors contribute to students’ success in learning English as a foreign language. Several recent studies have examined the impact of learner attitudes, beliefs, and emotions on English language outcomes in diverse contexts. Similar issues are examined in [6, 8, 9, 10, 11, 13, 16, 17, 18, 21].
One such investigation was conducted by El-Omari [14], who surveyed 496 secondary school students in Jordan using a 16-item yes/no questionnaire. His study found that students who reported more positive attitudes toward English, greater informal exposure to the language, and stronger parental support achieved higher academic results. Factors such as higher family income, having a quiet study space, and access to English media (including TV, newspapers, and dictionaries) were also associated with better performance. While El-Omari focused on secondary-school learners and relied on a basic yes/no questionnaire format, our study targets university-level bachelor’s degree students. It employs a more detailed survey including Likert-scale statements and open-ended questions. Additionally, we employ regression analysis to investigate which background variables, such as type of school, years of prior study, or self-assessed skills, can predict academic achievement in English. In doing so, we aim to extend previous findings to a new learner profile and educational context by focusing on students in a university setting.
A recent study [15] further underscores the intricate relationship between learners’ experiences and their motivation to learn English. Focusing on Japanese elementary school students, the authors examined how external influences such as family, school environment, and exposure to English-language media shape what is known as the “ideal L2 self”, or learners’ future-oriented vision of themselves as competent English users. Using hierarchical regression analysis, they found that while past exposure through cram schools and travel abroad initially appeared significant, these factors were outweighed by the ongoing influence of parents, teachers, and media. The study confirms that the surrounding social and educational environment plays a critical role in sustaining students’ motivation to learn English, suggesting that long-term success may depend less on isolated experiences and more on consistent support from key figures and resources in learners’ daily lives.
Another relevant study [19] examined how students’ beliefs about learning English affect their achievement, both directly and indirectly, through emotional factors. The researchers surveyed 440 Ethiopian university students and found that learners who held more sophisticated beliefs, such as confidence in their own efforts, openness to risk-taking, and a realistic view of English difficulty, performed better on English tests. These beliefs were also linked to reduced levels of anxiety and embarrassment and, in some cases, increased enjoyment. Using structural equation modelling, the study confirmed that emotional factors mediate the impact of learning beliefs on academic performance. The findings highlight the importance of promoting both advanced language learning beliefs and supportive emotional experiences to enhance students’ success in English, particularly in STEAME-related fields.
Liu and Li [18] also examined how internal learner factors, specifically classroom anxiety and motivation, affect English achievement. Surveying over 570 Chinese university students, they found that higher levels of English classroom anxiety were significantly associated with lower test scores and self-rated proficiency. At the same time, both intrinsic and extrinsic motivation were positively associated with better outcomes. Interestingly, the study also showed that students’ motivation and anxiety were inversely related, suggesting that reducing anxiety may boost motivation and vice versa. Drawing on self-determination theory, the authors recommend addressing emotional barriers and fostering motivational support as key strategies for improving performance in foreign language learning.
[8] examine how fuzzy logic can provide a fairer assessment of students’ mathematical knowledge by integrating written and oral grades with standardized test results. Using data from Italian high school students who also took the national INVALSI (the Italian National Institute for the Evaluation of the Education and Training System) math exam, they applied two defuzzification methods – centre of gravity and mean of maximum – to generate final grades. Both methods yielded lower average scores than traditional teacher assessments, with the centre of gravity approach producing the most conservative results.
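The two defuzzification rules named above can be illustrated with a minimal sketch. This is not the model used in [8]: the 1–10 grade axis, the triangular membership functions, and the rule strengths (0.4 and 0.8) are all hypothetical values chosen only to show how the centre of gravity and the mean of maximum are computed from the same aggregated fuzzy output.

```python
def triangle(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def centre_of_gravity(xs, mu):
    """Discrete centre of gravity: membership-weighted mean of the grade axis."""
    total = sum(mu)
    if total == 0:
        raise ValueError("aggregated membership is zero everywhere")
    return sum(x * m for x, m in zip(xs, mu)) / total


def mean_of_maximum(xs, mu):
    """Mean of the grade values at which the membership reaches its maximum."""
    peak = max(mu)
    at_peak = [x for x, m in zip(xs, mu) if m == peak]
    return sum(at_peak) / len(at_peak)


# Hypothetical grade axis from 1 to 10 in steps of 0.5 (an assumption).
xs = [1 + 0.5 * i for i in range(19)]

# Hypothetical aggregated output: the fuzzy set "sufficient" clipped at 0.4
# and the fuzzy set "good" clipped at 0.8, combined with max.
mu = [
    max(min(0.4, triangle(x, 4, 6, 8)), min(0.8, triangle(x, 6, 8, 10)))
    for x in xs
]

cog = centre_of_gravity(xs, mu)
mom = mean_of_maximum(xs, mu)
print(f"centre of gravity: {cog:.2f}, mean of maximum: {mom:.2f}")
```

In this toy configuration the centre of gravity is pulled below the mean of maximum by the weaker "sufficient" rule, because it weighs the whole aggregated set rather than only its peak. The example is illustrative only and says nothing about the magnitudes reported in [8].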
Building on the idea of alternative evaluation methods, [6] proposed an intelligent, computer-based testing system tailored for humanities students, who often provide narrative answers to open-ended questions. The system employs the shingle algorithm, together with stemming and MD5 hashing, to compare student answers against an ideal reference response. It measures the degree of matching (S), completeness (P), and overall effectiveness using the F-measure (F). Tested on a sample of 120 humanities students aged […], the system achieved an optimal processing time (t) of approximately […] and an F-measure of […], demonstrating a balance of accuracy and efficiency.
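The shingle-based comparison can be sketched as follows. This is a simplified illustration under stated assumptions, not the system of [6]: stemming is omitted, the shingle length `k = 3` is arbitrary, and the definitions of S (share of the answer's shingles found in the reference), P (share of the reference's shingles found in the answer), and F (their harmonic mean) are one plausible reading of the measures named above. The function names and sample sentences are hypothetical.

```python
import hashlib
import re


def shingle_hashes(text, k=3):
    """Return the set of MD5 digests of all k-word shingles in the text."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return set()
    if len(words) < k:
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.md5(s.encode("utf-8")).hexdigest() for s in shingles}


def score_answer(answer, reference, k=3):
    """Return (S, P, F) for a student answer against a reference answer."""
    a = shingle_hashes(answer, k)
    r = shingle_hashes(reference, k)
    common = len(a & r)
    s = common / len(a) if a else 0.0   # matching: assumed precision-like
    p = common / len(r) if r else 0.0   # completeness: assumed recall-like
    f = 2 * s * p / (s + p) if s + p else 0.0
    return s, p, f


# Hypothetical reference and student answers.
reference = "the industrial revolution began in britain in the eighteenth century"
answer = "the industrial revolution began in britain during the 1700s"

s, p, f = score_answer(answer, reference)
print(f"S = {s:.3f}, P = {p:.3f}, F = {f:.3f}")
```

Hashing each shingle lets the comparison work on fixed-length digests rather than raw strings. A production system would normally stem the words first, so that inflected forms of the same word produce the same shingle.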
[9] propose a fuzzy logic model that integrates students’ school grades (written and oral) with their results on the national INVALSI mathematics assessment to provide a more balanced evaluation of knowledge. Using data from more than […] Italian students in grades 8, 10, and 13 during the 2018–2019 school year, the authors applied fuzzification, inference rules, and defuzzification to generate “hypothetical grades.” The analysis reveals that these fuzzy logic-based grades are consistently and significantly lower than traditional teacher-assigned grades across all levels, indicating that the model mitigates grade inflation and teacher bias.
Similarly, applying fuzzy logic, [16] focused on predicting students’ academic performance based on perceptions of instructors. Using a Mamdani fuzzy inference model with survey data from 1,250 students, they evaluated the impact of trust, perception, and usefulness of instructors. The results indicated that trust and usefulness were strongly correlated with end-of-semester outcomes, demonstrating that fuzzy logic can effectively capture subjective perceptions to predict academic success.
The influence of contextual factors on fuzzy assessment is explored by [10]. Using hierarchical linear regression, they analyzed demographic variables, including gender, school type, and socioeconomic background. The study revealed that while fuzzy grading offers a more flexible evaluation framework, demographic differences continue to be significant predictors of academic outcomes.
Finally, [11] investigated the predictive power of machine learning, applying Random Forest regression to student performance on the INVALSI mathematics assessment. Combining traditional school grades with fuzzy-based grades, they compared linear models and Random Forest predictions. Their findings demonstrated that Random Forest regression improved predictive accuracy and highlighted the added value of fuzzy grades, which offer a more objective representation of student knowledge than teacher-assigned marks alone.
The present study aims to investigate the impact of key factors on knowledge acquisition in English language learning among students in computer science majors. At the same time, we expand the predictor model by including variables that take more than two (yes/no) values and by increasing the total number of predictors to twelve. The relative influence of these predictors is assessed using the CART Ebag method. To our knowledge, such techniques have not been systematically applied in the context of English language education in technical majors. By introducing data-driven methods into language education research, this work expands the analytical toolkit available for exploring predictors of academic performance.
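The general idea of ranking predictors with bagged CART trees can be sketched with scikit-learn. This is an illustration under explicit assumptions, not the CART Ebag implementation used in the study: the data are synthetic, the 61 rows and twelve columns only mirror the sample size and predictor count reported in the paper, the tree settings are arbitrary, and permutation importance is swapped in as a generic way of measuring relative influence.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.inspection import permutation_importance
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for the survey: 61 students, 12 ordinal predictors (1-5).
n_students, n_predictors = 61, 12
X = rng.integers(1, 6, size=(n_students, n_predictors)).astype(float)

# Synthetic outcome that depends on the first two predictors plus noise.
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0.0, 0.5, size=n_students)

# Bagging: each CART regression tree is fitted to a bootstrap resample,
# and the ensemble prediction is the average over all trees.
model = BaggingRegressor(
    DecisionTreeRegressor(min_samples_leaf=3),
    n_estimators=100,
    random_state=0,
)
model.fit(X, y)

# Permutation importance: the drop in score when one predictor is shuffled.
result = permutation_importance(model, X, y, n_repeats=20, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]

for idx in ranking[:3]:
    print(f"predictor {idx}: importance {result.importances_mean[idx]:.3f}")
```

Averaging many trees grown on bootstrap resamples reduces the variance of a single CART tree, which matters with a sample of this size. In practice the importances would be estimated on held-out or out-of-bag data rather than on the training set, as done here for brevity.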
The findings aim to inform more responsive teaching strategies, tailored to the needs of students in STEAME-related fields, and to extend previous research that has examined how social, educational, and motivational factors affect English learning outcomes.
1.1. English Language Education at FMI
At FMI, English is a compulsory part of first-year studies. Its relevance is clear: in the IT field, terminology, documentation, and professional communication are predominantly in English. Despite this, many students in mathematics and IT-related majors view English more as a tool than as an academic subject. Because many have studied English in secondary school or independently, they often assume that their language skills are already sufficient and not directly linked to their academic success. As a result, student motivation can vary considerably and often depends on self-perceived competence and prior experience.
This study aims to identify which aspects of that prior experience best predict success in English at university.
Most students take English only during their first academic year: General English in the first semester and English for Specific Purposes (ESP) in the second. An exception is students majoring in Business Information Technology, who also take a course in Business English in their final year. At the start of the academic year, all students complete a placement test, and based on the results, they are placed in language groups of approximately 20 students with similar proficiency levels.
The General English course develops both language skills and cultural knowledge. Students complete tasks such as paraphrasing, summarising, proofreading, and note-taking. Broader academic topics, such as plagiarism, are also addressed. Cultural aspects are explored through discussions of stereotypes, idioms, gestures, and other nuances.
Students work individually or in small groups, submitting tasks through the Classroom platform or presenting them in class. In the first semester, students collaborate on a team project that includes building a website, giving a presentation, and creating a test.
In the English for Specific Purposes course in the second semester, the emphasis shifts to technical vocabulary and soft skills. Project work becomes individual, with students researching and presenting on technology-related topics.
Each week, students complete homework assignments, which may include recorded presentations, audio files, written tasks, or quizzes. These are uploaded to the Classroom platform before a set deadline. In addition to a midterm test, students take a final test covering the material studied during the course. Assessment is continuous and based on:
Classwork (attendance and participation):
Homework and project work:
Final in-class test:
1.2. Participants and Data Collection
The study involved a random sample of 61 first-year students majoring in Computer Science (focused on programming), Business Information Technology (focused on the use of information technology in business), and Software Technology and Design (focused on the application of software products). The data included scores from the placement test and grades from various components of the course, including classwork, homework, project work, and final tests from both semesters, as well as self-assessments of knowledge and skills.
The first survey was conducted before the General English course (in the 2024/2025 academic year). It included closed-ended questions on gender, age, background, years of studying English, and type of secondary education. Other questions explored the intensity of school-based English learning, possession of English language certificates, and experiences using English abroad (e.g., reason for travel, duration, language challenges). Students were also asked about how frequently they practice English, their preferred activities, and their comfort level in participating in discussions in English. A final section used Likert-scale items to assess students’ views on the cultural aspects of language learning and asked them to identify their strongest and weakest language skills.
The second survey, conducted after the first semester, was shorter in length. Students were asked to self-assess their language progress on a scale ranging from “significant improvement” to “slight worsening”, and to explain their response. They also suggested what might motivate them to put more effort into the English course in the second semester and offered recommendations for course improvement. The dataset also includes the results from the placement test and the final grades in General English and English for Specific Purposes (GRADE 1 and GRADE 2).