1. Introduction
Effective feedback is central to the development of academic writing proficiency and learner autonomy in second and foreign language (L2) education. In foreign language learning contexts such as Türkiye, where large class sizes and limited instructional time constrain opportunities for individualized instructor feedback, the need for sustainable, equitable, and scalable feedback practices has become increasingly evident [1,2]. These contextual challenges have intensified interest in technology-enhanced feedback solutions capable of supporting writing development while reducing instructional burden.
Recent advances in artificial intelligence (AI) have introduced new possibilities for addressing such structural constraints through automated and generative feedback systems, fundamentally reshaping how feedback is delivered, interpreted, and utilized in higher education. Among AI-mediated feedback tools, automated feedback (AF) systems such as Grammarly and generative AI feedback (GenAI-F) tools such as ChatGPT represent two distinct approaches with differing pedagogical affordances. AF systems primarily target form-focused aspects of writing, including grammar, vocabulary, and mechanics, relying on predefined algorithms and scoring criteria [3]. Research conducted in L2 contexts, including Türkiye, suggests that AF can support linguistic accuracy and reduce surface-level errors; however, its capacity to address higher-order concerns such as content development, organization, and argumentation remains limited [4]. Moreover, the effectiveness of AF appears to be mediated by learners’ language proficiency and their ability to interpret automated corrections meaningfully.
In contrast, recent developments in generative AI have expanded the scope of AI-mediated feedback by enabling context-sensitive, discursive, and explanatory responses. Unlike conventional AF systems, GenAI-F tools can provide feedback on content, coherence, organization, and rhetorical effectiveness while engaging learners in interactive and reflective dialogue [5,6]. Emerging empirical evidence indicates that GenAI-F can facilitate idea generation, encourage deeper revision practices, and promote learner autonomy in academic writing [7,8]. Within the Turkish L2 context, where students often rely heavily on instructor authority and model texts, such dialogic feedback may offer more inclusive and sustainable writing support by fostering independent learning pathways.
However, feedback effectiveness cannot be examined independently of learner-related factors, particularly language proficiency. Previous research consistently demonstrates that learners’ uptake and use of feedback are mediated by their linguistic and cognitive resources [9,10]. Studies conducted in Turkish higher education contexts similarly report that learners with lower proficiency levels often struggle to interpret indirect or metalinguistic feedback, limiting its impact on revision quality [11]. Despite growing interest in AI-mediated feedback, empirical evidence remains limited regarding how different feedback types interact with proficiency levels to influence writing development in Türkiye.
Beyond immediate writing performance, feedback also plays a critical role in fostering self-regulated learning (SRL), a cornerstone of sustainable and lifelong education. Grounded in sociocognitive theory, SRL refers to learners’ proactive capacity to plan, monitor, and evaluate their learning processes through the dynamic interaction of cognition, behavior, and learning environments [12,13]. In L2 writing, SRL is particularly important given the sustained cognitive effort, strategic decision making, and emotional regulation required throughout the writing process.
Research conceptualizes SRL in L2 writing as a multidimensional construct encompassing cognitive, metacognitive, motivational, and social strategies [14]. Cognitive strategies support text generation and linguistic processing, while metacognitive strategies enable goal setting, planning, and self-monitoring. Motivational regulation strategies help sustain effort and engagement, and social strategies facilitate help-seeking and interaction with feedback sources. Empirical studies consistently demonstrate that learners who actively employ a wide range of SRL strategies produce higher-quality writing and exhibit greater autonomy in academic tasks [15,16]. From a sustainability perspective, SRL represents a transferable competence that supports learner agency beyond formal educational settings.
Feedback functions as a central catalyst for the development of SRL by providing information about performance relative to learning goals and enabling learners to regulate subsequent learning behaviors [17]. Within SRL frameworks, feedback operates as part of an iterative feedback loop in which learners interpret external input, generate internal feedback, and adjust strategies accordingly [13]. In L2 writing, effective feedback supports not only error correction but also reflection, self-evaluation, and strategic engagement with writing tasks [18]. While instructor-mediated feedback has been shown to promote SRL development, practical constraints in higher education limit the sustainability of individualized feedback practices, particularly in large classes.
AI-mediated feedback has therefore attracted increasing attention as a scalable alternative capable of supporting continuous learner engagement. Automated feedback systems have been shown to enhance linguistic accuracy and writing fluency, particularly in form-focused dimensions [19,20], yet they often provide limited support for meaning-level revisions or individualized strategy development [21]. Generative AI feedback tools, by contrast, offer interactive and adaptive feedback that may better support metacognitive awareness, reflection, and strategic writing behaviors [5,8]. Despite this potential, comparative research examining how AF and GenAI-F differentially influence writing performance and SRL strategy use remains scarce, particularly in non-Western higher education contexts.
Furthermore, the moderating role of language proficiency in AI-mediated feedback contexts has received insufficient empirical attention. While higher-proficiency learners tend to engage more selectively and critically with feedback, integrating it into their existing linguistic repertoires, lower-proficiency learners may experience cognitive overload when confronted with extensive corrective input [9,22]. Preliminary evidence suggests that GenAI-F may offer adaptive scaffolding through explanation and dialogue, potentially mitigating proficiency-related barriers [6]. Yet, this assumption remains underexplored in Turkish L2 writing contexts.
Addressing these gaps, the present study investigates the effects of two forms of AI-mediated feedback, automated feedback and generative AI feedback, on Turkish L2 learners’ writing performance and use of self-regulated learning strategies, while also examining the moderating role of English language proficiency as operationalized through CEFR levels. Situated within a Turkish state university context, this research aims to contribute to ongoing discussions on sustainable, technology-enhanced writing pedagogy and to inform pedagogical and policy-oriented decision making in support of inclusive and high-quality language education.
Specifically, the study addresses the following research questions:
1. To what extent do automated feedback (AF), generative AI feedback (GenAI-F), and instructor-mediated feedback (IMF) differ in their effects on Turkish L2 learners’ use of self-regulated learning strategies in academic writing?
2. To what extent do automated feedback (AF), generative AI feedback (GenAI-F), and instructor-mediated feedback (IMF) differ in their effects on Turkish L2 learners’ writing performance, both overall and across key writing subcomponents?
3. Does English language proficiency, operationalized using CEFR levels, moderate the effects of feedback type on Turkish L2 learners’ writing development?
2. Materials and Methods
The study was conducted in a credit-bearing compulsory English course offered to preparatory-school undergraduate students from non-English majors at a state university in Türkiye. The course aimed to develop students’ integrated academic English skills, including reading, listening, speaking, and writing, in alignment with the learning outcomes specified by the Council of Higher Education (YÖK). Instruction was delivered in two 90-minute sessions per week over a 14-week semester.
Similar to many Turkish higher education contexts, the course was characterized by large class sizes and intensive curricular demands, which limited instructors’ ability to provide individualized, process-oriented feedback on student writing. Consequently, writing tasks were typically evaluated using holistic rubrics aligned with CEFR descriptors, with limited formative commentary, a situation that reflects a broader need for sustainable and scalable feedback practices in Turkish L2 instruction [2,23].
Participants were recruited through convenience sampling from three intact classes taught by the same instructor to minimize instructional variability. A total of 86 students (58 females, 28 males) voluntarily participated in the study. The classes were randomly assigned to one of three conditions: an IMF group receiving no AI-mediated feedback (n = 27), an AF group using Grammarly (n = 29), and a GenAI-F group using ChatGPT (n = 30). Participants’ ages ranged from 18 to 22 years (M = 18.9, SD = 0.74). All participants had studied English for approximately 11 years in the Turkish education system and reported no long-term overseas study experience.
English proficiency was measured using the EF Standard English Test (EF SET) and interpreted according to CEFR benchmarks. Based on their scores, students were categorized into intermediate-level learners (B1–B2; n = 40) and advanced-level learners (C1; n = 44). Two students did not report valid scores and were excluded from proficiency-based analyses. One-way ANOVA confirmed no statistically significant differences in proficiency across the three groups at baseline (p > .05). A chi-square test further indicated that proficiency level was not significantly associated with group assignment, supporting the homogeneity of the groups.
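These baseline equivalence checks follow standard procedures; as a minimal illustration, the same tests can be reproduced in Python (the study’s analyses were run in SPSS, and the file name and column names below are hypothetical):

```python
# Baseline equivalence checks: one-way ANOVA on proficiency scores across
# the three feedback groups, plus a chi-square test of the association
# between proficiency band and group assignment.
import pandas as pd
from scipy import stats

df = pd.read_csv("participants.csv")  # hypothetical columns: group, set_score, proficiency_band

# One-way ANOVA: do mean EF SET scores differ across IMF, AF, and GenAI-F?
scores_by_group = [g["set_score"].dropna() for _, g in df.groupby("group")]
f_stat, p_anova = stats.f_oneway(*scores_by_group)

# Chi-square test of independence: proficiency band vs. group assignment.
contingency = pd.crosstab(df["proficiency_band"], df["group"])
chi2, p_chi2, dof, _ = stats.chi2_contingency(contingency)

print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.3f}")
print(f"Chi-square: chi2 = {chi2:.2f}, df = {dof}, p = {p_chi2:.3f}")
```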
Prior to participation, all students were informed about the research purpose, procedures, voluntary nature of participation, and their right to withdraw at any stage without academic penalty. Written informed consent was obtained from all participants. Ethical approval was granted by Ankara Hacı Bayram Veli University’s Ethics Committee, and all data were anonymized in accordance with institutional and national research ethics guidelines.
Data were collected over a ten-week period during the academic semester. All participants completed three argumentative writing tasks integrated into the course syllabus, with approximately three-week intervals between tasks. For each task, students wrote a 150–180-word essay within 30 minutes under controlled classroom conditions and without access to external resources. Writing prompts were aligned with course themes and instructional objectives.
Students in the instructor-mediated feedback (IMF) group completed the writing tasks and received only holistic scores based on a CEFR-aligned writing rubric. They did not receive formative feedback and were not required to revise their drafts. Students in the automated feedback (AF) group revised their drafts using feedback generated by Grammarly, which provides algorithm-based feedback on grammar, vocabulary use, clarity, and mechanics. Students in the generative AI feedback (GenAI-F) group revised their drafts using ChatGPT, which was employed solely as a feedback provider. These students were instructed to submit their drafts and request rubric-based feedback addressing content, organization, language use, and coherence.
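The study does not reproduce the students’ actual prompts; the following is a purely illustrative sketch of the kind of rubric-based feedback request described above, with all wording hypothetical:

```text
You are a writing tutor. Below is my 150–180-word argumentative essay.
Give feedback organized under four rubric headings: content, organization,
language use, and coherence. For each heading, name one strength, one
weakness, and one concrete revision suggestion. Do not rewrite the essay
for me.

[essay text]
```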
To ensure ethical and pedagogically appropriate use of AI tools, students in both experimental groups participated in structured 20-minute training sessions prior to each writing task. The training focused on understanding the scope and limitations of AI-mediated feedback, generating rubric-aligned prompts, critically evaluating feedback suggestions, and integrating feedback selectively. Sample prompts were provided, and students were encouraged to adapt them as needed. To ensure equity, students in the IMF group were granted access to the same training materials and AI tools after the completion of the study.
Three instruments were used for data collection.
To assess changes in writing performance, all participants completed a pre-test and a post-test argumentative writing task under examination conditions. Each test required a minimum of 150 words within 30 minutes. Essays were evaluated using the ESL Composition Profile [24], which assesses content, organization, vocabulary, language use, and mechanics. Initial scoring was conducted using ChatGPT-4 with rubric-embedded prompts to ensure consistency. An experienced L2 writing instructor independently rated a randomly selected 30% of the essays. Interrater reliability between AI-assisted and human ratings was satisfactory, with an intraclass correlation coefficient of 0.81.
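As an illustration of this reliability check, the intraclass correlation between AI-assisted and human ratings can be computed with Python’s pingouin package; the file and column names, and the choice of ICC form (which the paper does not specify), are assumptions:

```python
# Interrater reliability between AI-assisted and human ratings on the
# double-rated 30% subsample, expressed as an intraclass correlation.
import pandas as pd
import pingouin as pg

# Long format: one row per (essay, rater) pair; rater is "gpt4" or "human".
ratings = pd.read_csv("double_rated_essays.csv")  # columns: essay_id, rater, total_score

icc = pg.intraclass_corr(
    data=ratings, targets="essay_id", raters="rater", ratings="total_score"
)
# ICC2 (two-way random effects, absolute agreement) is a common choice for
# this design; the paper does not state which form was reported.
print(icc[icc["Type"] == "ICC2"][["Type", "ICC", "CI95%"]])
```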
Students’ use of self-regulated learning strategies was measured using the Writing Strategies for Self-Regulated Learning Questionnaire (WSSRLQ) [14]. The 35-item instrument assessed nine strategy subscales grouped under four dimensions: cognitive, metacognitive, motivational regulation, and social-behavioral strategies. The questionnaire was translated into Turkish and back-translated to ensure linguistic equivalence. Responses were recorded on a seven-point Likert scale. Cronbach’s alpha values ranged from 0.75 to 0.84 across subscales, indicating satisfactory reliability.
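A minimal sketch of the subscale reliability computation, assuming hypothetical item-level data and item labels (the original analysis was presumably run in SPSS):

```python
# Internal consistency of one WSSRLQ subscale: Cronbach's alpha over the
# items belonging to that subscale (wide format: one column per item).
import pandas as pd
import pingouin as pg

wssrlq = pd.read_csv("wssrlq_responses.csv")  # hypothetical item-level file

idea_planning_items = ["ip1", "ip2", "ip3", "ip4"]  # illustrative item labels
alpha, ci = pg.cronbach_alpha(data=wssrlq[idea_planning_items])
print(f"Cronbach's alpha = {alpha:.2f}, 95% CI = {ci}")
```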
English language proficiency was assessed using the EF Standard English Test, a validated online assessment aligned with CEFR descriptors [25]. Test scores were used to classify participants into proficiency groups for moderation analyses. At the beginning of the semester, participants completed the proficiency test, the pre-test writing task, and the SRL questionnaire. During the intervention, students completed three writing tasks under their assigned feedback conditions. At the end of the intervention period, all participants completed the post-test writing task and the SRL questionnaire within the same week.
All quantitative analyses were conducted using SPSS 23. Preliminary checks confirmed normality, homogeneity of variance, and equivalence of covariance matrices. To address RQ1 and RQ2, mixed-design ANOVAs were conducted to examine the effects of feedback type and time on SRL strategies and writing performance. To address RQ3, two-way ANOVA was used to examine the interaction between feedback type and proficiency level. Effect sizes were reported using partial eta squared and Cohen’s d, with Bonferroni corrections applied for multiple comparisons.
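For illustration, an equivalent of one such mixed-design ANOVA can be expressed with the pingouin package in Python; the long-format data frame and its column names are assumptions, since the reported analyses were run in SPSS:

```python
# 2 (time: pre/post, within) x 3 (feedback type, between) mixed ANOVA for a
# single dependent variable, mirroring the SPSS mixed-design analysis.
import pandas as pd
import pingouin as pg

# Long format: one row per participant per time point.
long_df = pd.read_csv("writing_scores_long.csv")  # columns: id, group, time, score

aov = pg.mixed_anova(
    data=long_df, dv="score", within="time", subject="id", between="group"
)
print(aov[["Source", "F", "p-unc", "np2"]])  # np2 = partial eta squared
```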
3. Results
The mixed factorial ANOVAs revealed statistically significant Time × Feedback Condition interaction effects for selected self-regulated learning (SRL) strategies. Although these interaction effects indicate that changes over time varied as a function of feedback condition, they do not specify the source or direction of these differences. To clarify how SRL strategy gains differed across groups, follow-up post-hoc analyses were conducted.
Bonferroni-adjusted pairwise comparisons were performed on pre-to-post gain scores for SRL strategies that showed significant interaction effects. These analyses compared the generative AI feedback group (GenAI-F), the automated feedback group (AF), and the instructor-mediated feedback group (IMF). This approach enabled a precise examination of whether observed gains could be attributed to specific feedback conditions rather than to general practice or time effects.
The Bonferroni correction was applied to control for inflated Type I error resulting from multiple pairwise comparisons. Given the multidimensional nature of SRL and the number of strategies examined, a conservative adjustment was necessary to enhance the reliability and replicability of the findings. The Bonferroni procedure was selected due to its transparency and common use in experimental educational research [26].
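A sketch of how these Bonferroni-adjusted pairwise comparisons of gain scores could be reproduced outside SPSS (hypothetical file and column names): with three groups, each raw p value is multiplied by the three pairwise comparisons made per strategy, equivalent to testing at α = .05/3 ≈ .017.

```python
# Bonferroni-adjusted pairwise comparisons of pre-to-post gain scores
# across the three feedback groups, with Cohen's d as the effect size.
import pandas as pd
import pingouin as pg

gains = pd.read_csv("srl_gain_scores.csv")  # columns: id, group, gain

posthoc = pg.pairwise_tests(
    data=gains, dv="gain", between="group",
    padjust="bonf",    # multiply each raw p by the number of comparisons (capped at 1)
    effsize="cohen",
)
print(posthoc[["A", "B", "p-corr", "cohen"]])
```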
In addition to adjusted significance levels, effect sizes were reported to indicate the practical magnitude of group differences. Reporting both adjusted p values and effect sizes supports a more meaningful interpretation of educational impact, which is particularly important in sustainability-oriented research that emphasizes durable and transferable learning outcomes.
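For reference, the reported Cohen’s d values are assumed here to follow the conventional pooled-standard-deviation formulation for two independent groups (the paper does not state the exact variant):

$$ d = \frac{\bar{X}_1 - \bar{X}_2}{s_p}, \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$

Under the usual benchmarks, d ≈ 0.5 is read as a medium effect and d ≈ 0.8 as large, which is the sense in which the gains reported below are described as medium to large.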
Table 1. Bonferroni-adjusted pairwise comparisons of SRL strategy gain scores.

| SRL Strategy | Group Comparison | Mean Difference (Δ) | SE | p (Adj.) | Cohen’s d |
| --- | --- | --- | --- | --- | --- |
| Idea planning | GenAI-F – AF | 0.48 | 0.14 | .004 | 0.71 |
|  | GenAI-F – IMF | 0.76 | 0.16 | < .001 | 1.02 |
|  | AF – IMF | 0.28 | 0.15 | .184 | 0.39 |
| Goal-oriented monitoring & evaluation | GenAI-F – AF | 0.42 | 0.13 | .007 | 0.65 |
|  | GenAI-F – IMF | 0.69 | 0.15 | < .001 | 0.94 |
|  | AF – IMF | 0.27 | 0.14 | .163 | 0.38 |
| Interest enhancement | GenAI-F – AF | 0.36 | 0.12 | .011 | 0.59 |
|  | GenAI-F – IMF | 0.58 | 0.14 | .002 | 0.82 |
|  | AF – IMF | 0.22 | 0.13 | .241 | 0.33 |
Bonferroni-adjusted post-hoc results indicated that learners in the GenAI-F condition achieved significantly greater gains in idea planning, goal-oriented monitoring and evaluation, and interest enhancement than learners in both the AF and IMF conditions. No statistically significant differences were observed between the AF and IMF groups. Effect sizes ranged from medium to large, indicating a meaningful advantage of generative AI feedback in fostering metacognitive and motivational dimensions of SRL.
Prior to the intervention, descriptive statistics showed that the three groups were comparable in their reported use of SRL strategies and overall writing performance. Pre-test mean scores for overall writing were similar across the IMF group (M = 75.18, SD = 5.62), the AF group (M = 75.41, SD = 4.38), and the GenAI-F group (M = 75.66, SD = 3.95). Comparable patterns were also observed across the five writing subcomponents: content, organization, vocabulary, language use, and mechanics.
To confirm group equivalence statistically, one-way ANOVAs were conducted on the nine SRL strategy dimensions, overall writing scores, and writing sub-scores. No significant differences were found among the three groups on any pre-test measure (p > .05). These findings indicate that participants entered the study with comparable levels of SRL strategy use and writing proficiency, providing a robust foundation for interpreting the effects of feedback condition in subsequent analyses.
To address the first research question, mixed factorial analyses of variance were conducted for the nine dimensions of self-regulated learning (SRL) strategies. Time (pre-test vs. post-test) was treated as a within-subject factor, and feedback type (instructor-mediated feedback [IMF], automated feedback [AF], and generative AI feedback [GenAI-F]) was included as a between-subject factor.
As shown in Table 2, statistically significant main effects of time were observed for several cognitive and metacognitive strategies, including text processing, knowledge rehearsal, idea planning, goal-oriented monitoring and evaluation, and feedback handling (p < .01). The associated partial eta squared values ranged from .17 to .53, indicating medium to large effects. These findings suggest that sustained engagement in academic writing tasks over the semester supported overall growth in SRL strategy use across all instructional conditions.
Beyond these general developmental trends, significant Time by Feedback Type interaction effects were identified for goal-oriented monitoring and evaluation and for interest enhancement. The interaction effect for goal-oriented monitoring and evaluation explained approximately 13 percent of the variance, indicating that changes in learners’ metacognitive regulation differed as a function of feedback condition. A comparable interaction effect was observed for interest enhancement, accounting for about 12 percent of the variance. This pattern suggests that learners’ motivational engagement with writing tasks evolved differently depending on the type of feedback they received.
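The percentages of variance reported here correspond to partial eta squared, the standard effect size for factorial ANOVA:

$$ \eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}} $$

So, for example, the interaction for goal-oriented monitoring and evaluation with $\eta_p^2 \approx .13$ accounts for roughly 13 percent of the variance in that outcome after the other modeled effects are partialled out.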
In contrast, no significant interaction effects were found for cognitive strategies such as text processing and knowledge rehearsal, nor for social-behavioral strategies such as peer learning and feedback handling. Similarly, motivational self-talk and emotional control did not show differential change across feedback conditions. Taken together, these results indicate that while engagement in writing instruction generally promoted SRL development, generative AI feedback exerted a distinctive influence on specific metacognitive and motivational regulation strategies that are central to sustained and autonomous learning.
To further interpret the significant Time by Feedback Type interaction effects, Bonferroni-adjusted simple effects analyses were conducted on post-test SRL gain scores. These analyses aimed to clarify the direction and magnitude of between-group differences in strategies that exhibited differential developmental trajectories over time.
As reported in Table 3, statistically significant between-group differences emerged for goal-oriented monitoring and evaluation and for interest enhancement. Learners in the GenAI-F condition demonstrated significantly greater gains in interest enhancement than those receiving AF, indicating stronger motivational engagement when feedback was dialogic and explanatory. In contrast, learners in the IMF condition showed significantly greater improvement in goal-oriented monitoring and evaluation than those in the AF group. This pattern may reflect Turkish L2 learners’ increased reliance on self-monitoring and evaluative strategies in instructional contexts where automated feedback is absent and accountability for performance is internalized, particularly within exam-oriented higher education settings.
No statistically significant group differences were observed for the remaining SRL strategies, suggesting that the differential impact of feedback type was selective rather than uniform across SRL dimensions.
Within-group comparisons further clarified the developmental patterns associated with each feedback condition. Paired-samples t-tests were conducted to examine pre-test to post-test changes in SRL strategies for each group, as presented in Table 4.
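A minimal sketch of one such within-group comparison in Python (hypothetical file and column names; the paired effect size shown is the d_z variant, one common choice for paired designs that the paper does not explicitly name):

```python
# Within-group pre/post comparison for one SRL subscale in one feedback
# group: paired-samples t-test plus a paired Cohen's d.
import pandas as pd
from scipy import stats

wide = pd.read_csv("srl_wide.csv")  # columns: id, group, pre_idea_planning, post_idea_planning
genai = wide[wide["group"] == "GenAI-F"]

t_stat, p_val = stats.ttest_rel(genai["post_idea_planning"], genai["pre_idea_planning"])

diff = genai["post_idea_planning"] - genai["pre_idea_planning"]
d_paired = diff.mean() / diff.std(ddof=1)  # d_z: mean difference over SD of differences

print(f"t = {t_stat:.2f}, p = {p_val:.3f}, d_z = {d_paired:.2f}")
```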
Students in the AF group demonstrated significant gains in text processing, knowledge rehearsal, and idea planning, with medium to large effect sizes. However, no significant changes were observed in motivational regulation strategies, suggesting that form-focused automated feedback primarily supported lower-level cognitive regulation rather than motivational engagement.
The GenAI-F group exhibited significant gains in text processing, knowledge rehearsal, and idea planning, with the largest effect observed for idea planning. This finding highlights the strong planning-oriented affordances of generative AI feedback. In addition, students in this group reported a significant increase in interest enhancement, indicating improved motivational engagement with writing tasks. At the same time, small but significant declines were observed in feedback handling and emotional control. These changes may reflect increased reliance on AI-generated guidance or heightened cognitive and affective demands associated with interacting with generative systems.
The IMF group showed moderate to large improvements in several cognitive and metacognitive strategies, including idea planning and goal-oriented monitoring and evaluation. However, a significant decline in feedback handling was also observed, underscoring the constraints faced by learners when opportunities for individualized feedback are limited.
Taken together, these within-group findings indicate that GenAI-F exerted the strongest influence on planning-related and motivational SRL strategies, whereas AF feedback primarily supported surface-level cognitive regulation. Instructor-mediated feedback promoted metacognitive monitoring but offered more limited support for sustained motivational engagement.
To examine the effects of feedback type on learners’ writing performance, a series of 2 × 3 mixed factorial analyses of variance were conducted, with time (pre-test vs. post-test) as the within-subject factor and feedback type (IMF, AF, GenAI-F) as the between-subject factor. Analyses were performed for overall writing scores and for five writing subcomponents, as summarized in Table 5.
The analyses revealed significant main effects of time on overall writing performance as well as on content, vocabulary, and mechanics (p < .001), with medium to large effect sizes. These findings indicate that students’ writing performance improved over the semester across all instructional conditions, reflecting cumulative learning effects associated with sustained writing practice.
A significant main effect of feedback type was observed for content scores, suggesting that different feedback modalities had differential impacts on idea development and elaboration. More importantly, statistically significant Time by Feedback Type interaction effects emerged for overall writing scores and content. These interaction effects accounted for 13 percent and 11 percent of the variance, respectively, indicating that the magnitude of writing improvement differed meaningfully across feedback conditions.
Follow-up simple effects analyses demonstrated that, at the post-test, learners in the GenAI-F condition outperformed those in the IMF condition on overall writing scores and mechanics. In addition, the GenAI-F group achieved significantly higher overall and content scores than the AF group. These findings point to the added value of generative AI feedback in supporting higher-order writing processes, particularly content development and argument elaboration, which are areas of persistent difficulty for Turkish L2 learners due to limited opportunities for extended and iterative writing.
Within-group analyses further revealed differentiated patterns of development. The AF group demonstrated significant gains in content and mechanics, with particularly strong effects for content, indicating the effectiveness of automated feedback for improving surface-level accuracy and clarity. The GenAI-F group showed significant improvements in overall writing performance as well as in content, vocabulary, and mechanics, with consistently large effect sizes across components. In contrast, the IMF group exhibited more modest gains, primarily in content and language-related sub-scores, consistent with general instructional effects observed in teacher-led writing contexts.
Overall, these findings suggest that while all feedback conditions contributed to writing development over time, generative AI feedback was associated with broader and more substantial gains across multiple dimensions of writing performance, highlighting its potential role in supporting sustainable and scalable writing instruction in higher education.
To examine whether L2 proficiency moderated the effects of feedback type on writing development, a two-way analysis of variance was conducted on overall writing gain scores. The results revealed a statistically significant main effect of feedback type, as well as a significant interaction between feedback type and L2 proficiency level. These findings indicate that the effectiveness of feedback varied depending on learners’ proficiency.
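An equivalent of this moderation analysis can be sketched with statsmodels in Python (hypothetical file and column names; note that SPSS defaults to Type III sums of squares, whereas Type II is shown here for simplicity):

```python
# Two-way ANOVA on overall writing gain scores: feedback type (3 levels) x
# CEFR proficiency band (2 levels), including their interaction.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

gains = pd.read_csv("writing_gains.csv")  # columns: id, feedback, proficiency, gain

model = ols("gain ~ C(feedback) * C(proficiency)", data=gains).fit()
anova_table = sm.stats.anova_lm(model, typ=2)  # Type II sums of squares
print(anova_table)
```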
Simple effects analyses showed that within the AF condition, advanced-proficiency learners achieved significantly greater writing gains than intermediate-proficiency learners. This pattern suggests that learners with higher linguistic competence were better able to interpret and apply form-focused automated feedback. In contrast, intermediate-proficiency learners in the GenAI-F condition demonstrated significantly greater writing gains than intermediate-proficiency learners in the AF condition. This finding highlights the scaffolding role of generative AI feedback, which appears particularly beneficial for Turkish L2 learners who possess foundational language knowledge but require support with idea generation, organization, and elaboration.
Taken together, these results indicate that GenAI-F is especially effective for intermediate-level learners, whereas AF appears more suitable for advanced learners. This proficiency-sensitive pattern underscores the importance of aligning feedback technologies with learner characteristics in sustainable and inclusive L2 writing instruction.
To further clarify the nature of the significant interaction effects identified in the mixed factorial analyses, Bonferroni-adjusted post-hoc comparisons were conducted on post-test writing scores across the three feedback conditions. These analyses aimed to identify specific group differences in overall writing performance and its analytic subcomponents. The results are presented in Table 6.
The post-hoc results indicate that learners in the GenAI-F condition outperformed those in the IMF condition on overall writing performance and content. In addition, GenAI-F learners achieved significantly higher overall and content scores than those in the AF condition. For mechanics, both AF and GenAI-F groups demonstrated advantages over the IMF group, suggesting that technology-mediated feedback supported surface-level accuracy more effectively than instructor-mediated feedback alone.
To examine how writing performance developed over time within each instructional condition, paired-samples t-tests were conducted for overall writing scores and the five analytic subcomponents. The results are presented in Table 7.
Viewed through the lens of sustainability and lifelong learning, these findings highlight feedback-mediated self-regulated learning as a central mechanism for developing durable writing competence beyond short-term performance gains. Learners in the GenAI-F condition, who demonstrated marked growth in metacognitive regulation, particularly idea planning and goal-oriented monitoring, alongside increased motivational engagement, also achieved the most substantial and transferable improvements in writing quality, especially in content development.
These gains suggest that generative AI feedback can support learners in developing the capacity to independently plan, evaluate, and refine their writing over time. Such competencies are essential for sustainable language learning in digitally mediated higher education contexts. In contrast, improvements in the AF condition were largely confined to mechanics and form-focused accuracy, reflecting gains in procedural efficiency rather than broader self-regulatory control. While these outcomes remain pedagogically valuable, their limited impact on higher-order SRL dimensions may restrict learners’ long-term adaptability as autonomous writers.
The comparatively modest progress observed in the IMF condition further underscores the importance of feedback practices that actively foster learners’ self-regulation, motivation, and agency. Collectively, the results suggest that AI-empowered feedback systems designed to cultivate SRL strategies can contribute to more sustainable models of L2 writing instruction by equipping learners with self-directed skills essential for continuous learning across academic, professional, and lifelong contexts.
To further examine whether learners’ L2 proficiency moderated the effectiveness of different feedback types, a two-way analysis of variance was conducted with feedback type (instructor-mediated feedback, automated feedback, and generative AI feedback) and CEFR-based proficiency level (intermediate vs. advanced) as between-subject factors, and overall writing gain scores as the dependent variable.
Figure 1 illustrates the interaction pattern by comparing post–pre writing gains across feedback conditions for learners at different CEFR levels.
As illustrated in Figure 1, a clear interaction pattern emerged between feedback type and CEFR proficiency level. For advanced learners (C1), both AI-supported conditions led to greater writing gains than the IMF condition, with AF yielding particularly strong improvements. In contrast, intermediate learners (B1–B2) benefited most from GenAI-F, showing substantially larger gains than their counterparts in the AF and IMF groups. This divergence suggests that GenAI-F may provide more adaptive scaffolding for learners with developing proficiency, while advanced learners are better positioned to capitalize on form-focused AF. Overall, the figure visually substantiates the statistically significant moderation effect reported in the ANOVA results and highlights the importance of aligning AI feedback types with learners’ proficiency levels to support equitable and sustainable writing development in higher education. From a sustainability perspective, these findings suggest that GenAI-F can redistribute cognitive responsibility to learners, fostering durable self-regulation skills rather than short-term performance gains.
4. Discussion
This study examined how different forms of AI-mediated feedback influenced Turkish L2 learners’ self-regulated learning (SRL) strategies and writing development over time, with particular attention to sustainability and learner equity. Across all feedback conditions, significant gains were observed in multiple SRL dimensions, confirming prior research demonstrating that sustained academic writing practice fosters strategic engagement and learner responsibility [13,27]. More importantly, significant interaction effects between time and feedback type emerged for specific SRL dimensions, indicating that feedback modality plays a meaningful role in shaping learners’ strategic development.
The interaction effects observed for metacognitive regulation were primarily driven by gains in the idea planning and goal-oriented monitoring subscales. The effect sizes for these dimensions were in the moderate range, suggesting that the improvements were not only statistically significant but also pedagogically meaningful. Learners in the GenAI-F condition demonstrated a greater capacity to generate ideas, articulate writing goals, and align revisions with rhetorical intent over time. These findings are consistent with socio-cognitive models of self-regulated learning, which emphasize explanatory, dialogic, and goal-referenced feedback as key mechanisms supporting metacognitive regulation [28,29]. Unlike automated feedback systems that primarily target surface-level linguistic accuracy, generative AI feedback provided adaptive explanations and revision-oriented prompts that encouraged learners to reflect on content development and planning decisions, thereby functioning as an external metacognitive scaffold [5,8].
With respect to motivational regulation, the significant interaction effect for interest enhancement was accompanied by a small to moderate effect size. Although smaller in magnitude than the metacognitive gains, this effect is educationally consequential given the cumulative nature of motivation in sustained writing practice. The dialogic and responsive nature of GenAI-F appears to have increased perceived task value and learner engagement, supporting learners’ willingness to invest effort across successive writing cycles. This finding aligns with previous research indicating that feedback which enhances learner autonomy and perceived control contributes to motivational sustainability and long-term engagement [16,17]. In the Turkish higher education context, where writing instruction is often exam-driven and instructor-centered, such motivational regulation is particularly important for sustaining learner autonomy.
In contrast, no significant interaction effects were found for the cognitive regulation or social regulation subscales, and the associated effect sizes were negligible. This pattern suggests that these dimensions may be less sensitive to feedback modality alone or may require explicit instructional scaffolding and longer timeframes to develop. Previous research has similarly shown that cognitive and social SRL strategies are strongly influenced by task design and opportunities for peer interaction rather than feedback alone [9,23]. Given the predominantly individual and product-oriented nature of academic writing tasks in Turkish universities, opportunities for peer-supported regulation remain limited, which likely constrained the development of social SRL strategies regardless of feedback type.
Writing performance results further reinforced the sustainability value of generative AI feedback. Learners receiving GenAI-F demonstrated the largest overall gains in writing quality, particularly in content development, which showed moderate to large effect sizes. Content development is widely regarded as a key indicator of transferable academic literacy and higher-order writing competence [30]. In contrast, automated feedback primarily supported mechanics and surface-level accuracy, yielding small to moderate effects in these subcomponents. This pattern aligns with previous findings indicating that automated feedback systems are effective for form-focused revision but offer limited support for meaning-level writing development [3,4]. Together, these results highlight the complementary yet distinct pedagogical roles of different AI feedback tools.
The moderation effect of language proficiency provides further insight into equitable AI deployment. Higher-proficiency learners benefited more from automated feedback, likely due to stronger linguistic resources and greater capacity to interpret decontextualized corrective input. In contrast, intermediate-level learners demonstrated greater gains under GenAI-F conditions, suggesting that adaptive explanations and dialogic interaction may mitigate proficiency-related barriers to feedback uptake. This finding is consistent with research showing that learners’ engagement with feedback is mediated by linguistic proficiency and metacognitive resources [6,10]. From an equity perspective, generative AI feedback may therefore reduce proficiency-based disparities by offering differentiated support aligned with learners’ needs.
From a sustainability standpoint, the findings suggest that GenAI-F can function as a scalable metacognitive and motivational scaffold that supports strategic development without increasing instructor workload. However, the limited impact of automated feedback on higher-order SRL highlights the importance of embedding AI tools within pedagogical frameworks that explicitly promote reflection, self-evaluation, and learner agency [4,21]. Sustainable AI integration thus depends not only on technological access but also on instructional design and feedback literacy.
Overall, the findings indicate that proficiency-sensitive and pedagogically grounded AI-mediated feedback can support sustainable, equitable, and lifelong-oriented writing instruction in Turkish higher education. By demonstrating differential effects across SRL subscales and writing dimensions, this study contributes empirical evidence to ongoing debates on how AI technologies can be integrated responsibly and effectively into L2 writing pedagogy.
5. Conclusions
Grounded in sociocognitive theory and sustainability-oriented higher education policy, this study investigated the effects of automated feedback and generative AI feedback on Turkish L2 learners’ self-regulated learning strategies and writing performance, while examining the moderating role of CEFR-aligned proficiency levels.
The findings demonstrate that AI-empowered feedback can foster sustainable writing development when it strengthens learners’ self-regulatory capacities rather than focusing solely on surface-level correction. While all groups benefited from instruction over time, GenAI-F showed distinctive advantages in promoting metacognitive planning and motivational engagement, which were associated with stronger and more durable improvements in writing quality, particularly in content development.
Importantly, the effectiveness of AI feedback was proficiency-dependent. Intermediate learners benefited most from GenAI-F due to its adaptive scaffolding and explanatory affordances, whereas advanced learners showed greater gains under automated feedback. This pattern highlights the importance of equitable, learner-sensitive AI integration in mass higher education systems.
From a sustainability perspective, AI-supported writing pedagogy offers potential to reduce instructor workload, promote learner autonomy, and support differentiated instruction at scale. However, the findings also underscore the necessity of pedagogically guided AI use to ensure that social interaction, reflection, and self-evaluation remain central to writing instruction.
Within the context of Türkiye’s digital transformation agenda in higher education, this study provides evidence that AI tools should be strategically embedded within curricula rather than adopted as stand-alone solutions. Sustainable L2 writing instruction emerges from the balanced alignment of technology, pedagogy, and learner agency.
Despite limitations related to intervention duration and reliance on self-report measures, the study offers robust evidence that GenAI-F, when proficiency-aligned and pedagogically framed, can serve as a catalyst for sustainable L2 writing development. By foregrounding self-regulated and lifelong learning capacities, AI-supported feedback can meaningfully contribute to the broader sustainability mission of higher education in Türkiye and comparable contexts.
The findings of this study offer several pedagogically significant implications for sustainable L2 writing instruction in Turkish higher education, particularly in relation to Türkiye’s digital transformation agenda, YÖK’s strategic priorities [31], and universities’ responsibility to foster lifelong, self-regulated learners. In this context, sustainability extends beyond environmental considerations to include pedagogical durability, learner autonomy, and equitable access to high-quality feedback in mass higher education.
First, the differential effects of GenAI-F on metacognitive and motivational SRL strategies highlight its potential to address a persistent structural challenge in Turkish universities: large class sizes and limited opportunities for individualized formative feedback in compulsory English courses. As noted in YÖK policy reports [32,33], heavy teaching loads often constrain instructors’ capacity to provide sustained feedback on student writing. The present findings suggest that GenAI-F can function as a scalable supplementary feedback mechanism, supporting idea planning, goal setting, and interest regulation without increasing instructor workload. From a sustainability perspective, this aligns with resource-efficient pedagogy, where digital tools are leveraged to maintain instructional quality over time.
Second, the observed relationship between SRL strategy development and writing performance underscores the need to shift Turkish L2 writing pedagogy from predominantly product-oriented assessment toward process-oriented and self-regulatory approaches. Writing instruction in many Turkish universities continues to emphasize grammatical accuracy and exam-driven outcomes, shaped by centralized assessment traditions and CEFR alignment [32]. The present results indicate that when learners are supported in regulating planning, monitoring, and motivation, particularly through dialogic GenAI-F, writing development becomes more durable and transferable across tasks. Embedding SRL-focused pedagogy into university English curricula may therefore better support sustainable academic literacy development than short-term performance gains.
Third, the moderating role of CEFR proficiency level has important implications for equity and inclusivity in Türkiye’s digital education initiatives. Intermediate-level learners (CEFR B1–B2), who constitute the majority of first-year undergraduates in state universities, benefited more from GenAI-F than from AF feedback, whereas advanced learners (C1) showed greater gains under AF conditions. This finding cautions against one-size-fits-all approaches to AI integration, which may inadvertently amplify achievement gaps. Sustainable implementation requires differentiated feedback models, in which GenAI tools scaffold learners with developing proficiency, while more advanced learners are encouraged to critically engage with form-focused automated feedback.
Fourth, the mixed effects observed in social and emotional regulation strategies highlight the need for pedagogically guided AI use. While GenAI-F enhanced interest regulation, it was also associated with reduced reliance on peer- and instructor-mediated feedback. In the Turkish context, where collaborative learning and instructor guidance remain culturally valued, sustainable AI integration should be framed as complementary rather than substitutive. Combining GenAI-F with structured peer review, reflective writing tasks, and explicit SRL strategy instruction may help preserve the social dimension of learning while capitalizing on AI affordances.
Finally, these findings align closely with Türkiye’s broader sustainability and lifelong learning objectives articulated in national higher education and digitalization policies [33,34]. By fostering learners’ capacity to self-regulate their writing through informed and reflective use of AI feedback, universities can support the development of adaptive, autonomous, and digitally literate graduates prepared for continuous learning beyond formal education. Sustainable L2 writing pedagogy in Türkiye, therefore, is not merely a matter of technological adoption, but of cultivating enduring learner capacities that remain functional across evolving academic, professional, and technological contexts.
In sum, this study suggests that strategically differentiated, SRL-oriented integration of GenAI-F holds substantial promise for advancing sustainable L2 writing pedagogy in Turkish higher education. Future institutional efforts should focus not only on technology deployment, but also on pedagogical alignment, instructor professional development, and policy-level guidance to ensure that AI-enhanced writing instruction contributes meaningfully to long-term educational sustainability.
Author Contributions
Conceptualization, A.I. and A.E.S.; methodology, A.I., O.Y. and G.D.; formal analysis, O.Y. and G.D.; data curation, O.Y.; writing—original draft preparation, A.I., G.D., A.E.S. and O.Y.; writing—review and editing, A.E.S. and A.I.; visualization, G.D.; supervision, A.E.S. and A.I. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
The study was conducted in accordance with the Declaration of Helsinki, and approved by the Ethics Committee of Ankara Hacı Bayram Veli University (protocol code 11054618-302.07.98-, date of approval 21.03.2024).
Informed Consent Statement
Informed consent was obtained from all subjects involved in the study.
Data Availability Statement
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request. None of the experiments were preregistered.
Acknowledgments
The authors have reviewed and edited the output and take full responsibility for the content of this publication.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Kıvrak, C. Turkish EFL learners’ perceptions and preferences of written corrective feedback. J. Lang. Res. 2023, 7, 1–18. [Google Scholar] [CrossRef]
- Yılmaz, A. The effect of technology integration in education on prospective teachers’ critical and creative thinking, multidimensional 21st century skills and academic achievements. Particip. Educ. Res. 2021, 8, 163–199. [Google Scholar] [CrossRef]
- Wilson, J.; Roscoe, R.D. Automated writing evaluation and feedback: Multiple metrics of efficacy. J. Educ. Comput. Res. 2020, 58, 87–125. [Google Scholar] [CrossRef]
- Barrot, J.S. Using automated written corrective feedback in the writing classrooms: Effects on L2 writing accuracy. Comput. Assist. Lang. Learn. 2023, 36, 584–607. [Google Scholar] [CrossRef]
- Godwin-Jones, R. Partnering with AI: Intelligent writing assistance and instructed language learning. Lang. Learn. Technol. 2022, 26, 5–24. [Google Scholar] [CrossRef]
- Su, Y.; Lin, Y.; Lai, C. Collaborating with ChatGPT in argumentative writing classrooms. Assess. Writ. 2023, 57, 100752. [Google Scholar] [CrossRef]
- Khojasteh, L.; Soori, A.; Javed, F. Comparing teacher E-feedback, AI feedback, and hybrid feedback in enhancing EFL writing skills. Technol. Lang. Teach. Learn. 2025, 7, 102626. [Google Scholar] [CrossRef]
- Kasneci, E.; Sessler, K.; Küchemann, S.; Bannert, M.; Dementieva, D.; Fischer, F.; Gasser, U.; Groh, G.; Günnemann, S.; Hüllermeier, E.; et al. ChatGPT for good? Opportunities and challenges of large language models for education. Learn. Individ. Differ. 2023, 103, 102274. [Google Scholar] [CrossRef]
- Zhang, Z.V.; Hyland, K. Student engagement with teacher and automated feedback on L2 writing. Assess. Writ. 2018, 36, 90–102. [Google Scholar] [CrossRef]
- Zheng, Y.; Yu, S. Student engagement with teacher written corrective feedback in EFL writing: A case study of Chinese lower-proficiency students. Assess. Writ. 2018, 37, 13–24. [Google Scholar] [CrossRef]
- Alharbi, A.; Alsolami, T. The effectiveness of corpora on Saudi EFL academic writing performance. J. Appl. Stud. Lang. 2020, 4, 331–345. [Google Scholar] [CrossRef]
- Bandura, A. Social Foundations of Thought and Action: A Social Cognitive Theory; Prentice Hall: Englewood Cliffs, NJ, USA, 1986. [Google Scholar]
- Zimmerman, B.J.; Schunk, D.H. Handbook of Self-Regulation of Learning and Performance; Routledge: New York, NY, USA, 2011. [Google Scholar]
- Teng, L.S.; Zhang, L.J. A questionnaire-based validation of multidimensional models of self-regulated learning strategies. Mod. Lang. J. 2016, 100, 1–15. [Google Scholar] [CrossRef]
- Dong, L. Self-Regulatory Writing Strategies and Second Language Writing Proficiency: Interplay, Influence, and Insights; Routledge, 2025. [Google Scholar] [CrossRef]
- Teng, L.S. Self-Regulated Learning and Language Learning Strategies. In Self-Regulated Learning and Second Language Writing; Teng, L.S., Ed.; Springer: Cham, Switzerland, 2022; pp. 15–30. [Google Scholar] [CrossRef]
- Hattie, J.; Timperley, H. The power of feedback. Rev. Educ. Res. 2007, 77, 81–112. [Google Scholar] [CrossRef]
- Yang, L.F.; Liu, Y.; Xu, Z. Examining the effects of self-regulated learning-based teacher feedback on English-as-a-foreign-language learners’ self-regulated writing strategies and writing performance. Front. Psychol. 2022, 13, 1027266. [Google Scholar] [CrossRef] [PubMed]
- Stevenson, M.; Phakiti, A. Automated Feedback and Second Language Writing. In Feedback in Second Language Writing: Contexts and Issues; Hyland, K., Hyland, F., Eds.; Cambridge University Press: Cambridge, UK, 2019; pp. 125–142. [Google Scholar] [CrossRef]
- Barrot, J.S. Using ChatGPT for second language writing: Pitfalls and potentials. Assess. Writ. 2023, 57, 100745. [Google Scholar] [CrossRef]
- Ranalli, J. Automated written corrective feedback: How well can students make use of it? Comput. Assist. Lang. Learn. 2018, 31, 653–674. [Google Scholar] [CrossRef]
- Jitpaisarnwattana, N.; Saville, N. Autonomous learning and students’ perceptions of automated writing evaluation as a tool to improve writing skills. Technol. Lang. Teach. Learn. 2025, 7, 102826. [Google Scholar] [CrossRef]
- Kırkgöz, Y. Globalization and English language policy in Turkey. Educ. Policy 2009, 23, 663–684. [Google Scholar] [CrossRef]
- Jacobs, H.L.; Zinkgraf, S.A.; Wormuth, D.R.; Hartfiel, V.F.; Hughey, J.B. Testing ESL Composition: A Practical Approach; Newbury House: Rowley, MA, USA, 1981. [Google Scholar]
- Walczak, A. Computer-adaptive testing. Res. Notes 2015, 59, 35–39. [Google Scholar]
- Field, A. Discovering Statistics Using IBM SPSS Statistics, 5th ed.; SAGE Publications: London, UK, 2018. [Google Scholar]
- Teng, L.S. Developmental trajectories of SRL: Evidence from a case study. In Self-Regulated Learning and Second Language Writing; Teng, L.S., Ed.; Springer: Cham, Switzerland, 2022; pp. 183–207. [Google Scholar] [CrossRef]
- Butler, D.L.; Winne, P.H. Feedback and self-regulated learning: A theoretical synthesis. Rev. Educ. Res. 1995, 65, 245–281. [Google Scholar] [CrossRef]
- Zimmerman, B.J. Investigating self-regulation and motivation: Historical background, methodological developments, and future prospects. Am. Educ. Res. J. 2008, 45, 166–183. [Google Scholar] [CrossRef]
- Matsuda, P.K. Feedback in Second Language Writing: Contexts and Issues; Hyland, K., Hyland, F., Eds.; Cambridge University Press: Cambridge, UK, 2006; Volume 8, pp. 75–77. [Google Scholar] [CrossRef]
- Council of Higher Education (YÖK). Higher Education Quality Assurance and CEFR Alignment in Foreign Language Education; YÖK: Ankara, Türkiye, 2020. [Google Scholar]
- Council of Higher Education (YÖK). Digital Transformation in Higher Education: Strategic Roadmap; YÖK: Ankara, Türkiye, 2021. [Google Scholar]
- Council of Higher Education (YÖK). Sustainability and Digitalization in Turkish Universities; YÖK: Ankara, Türkiye, 2023. [Google Scholar]
- Ministry of National Education. Digital Content and Skills-Backed Transformation of the Learning Process; MoNE: Ankara, Türkiye, 2023. [Google Scholar]