Preprint
Article

This version is not peer-reviewed.

Visual System Alterations for Identifying Teacher-Reported Academic Difficulties in Schoolchildren: A Machine Learning Analysis

Submitted: 27 April 2026

Posted: 28 April 2026


Abstract
Background/Objectives: Efficient visual processing plays an important role in children’s academic activities, particularly reading, writing, and sustained attention. Alterations in different visual subsystems may interfere with school performance, although their relative discriminative value remains unclear. This study evaluated the ability of five visual system alterations to identify teacher-reported academic difficulties in schoolchildren using machine learning models. Methods: An observational analytical study was conducted in 581 primary schoolchildren (mean age 8.47 ± 1.74 years; 53.4% girls). Academic performance was rated by teachers using a 1–5 scale and dichotomized as a pragmatic school-based indicator of academic difficulties. Five predictor groups were analyzed: objective oculomotor function assessed with DIVE, clinical oculomotor assessment, accommodative system, vergence system, and abnormal axial length. Five classifiers (support vector machine, k-nearest neighbors, decision tree, random forest, and XGBoost) were trained and tested using a 70/30 data split. Results: The accommodative system showed the highest classification performance (XGBoost: accuracy 1.00; macro-F1 1.00). Oculomotor alterations also demonstrated strong discriminative ability (DIVE: accuracy 0.921; macro-F1 0.878; clinical assessment: accuracy 0.934; macro-F1 0.918). Vergence variables showed low sensitivity despite high specificity, whereas axial length achieved lower overall performance (accuracy 0.678; macro-F1 0.558). Conclusions: Classification performance differed across visual domains, with functional systems directly involved in near vision tasks showing greater relevance than biometric measures. These findings support the inclusion of functional visual assessment in school screening programs and suggest that machine learning models may assist referral prioritization and educational support strategies. Further external validation is required.

1. Introduction

Poor academic performance in childhood may arise from multiple causes, including learning difficulties. These difficulties are multifactorial in nature and often occur in children with normal or even high intellectual ability [1,2]. Among the factors that may contribute to learning difficulties, visual dysfunction deserves particular attention in the school setting, where most relevant information is delivered through visual channels, including printed materials, whiteboards, digital screens, and reading or copying tasks [3]. Consequently, impairments in visual function may interfere with the acquisition, selection, and stabilization of information required for reading, writing, and sustained attention [4,5]. Since academic difficulties in childhood may have long-term consequences for psychosocial development, self-esteem, and future educational attainment, the early identification of modifiable contributing factors is clinically relevant.
Educational demands themselves have also been linked to visual development. Epidemiological and genetic studies have shown a robust association between educational exposure and visual phenotypes, particularly myopia, with evidence suggesting that the predominant direction of effect runs from education toward refractive change [6,7]. This supports the view that school-related visual demands, especially prolonged near work and reading, provide a relevant framework for examining how visual function may relate to academic outcomes.
Efficient visual performance depends on several integrated subsystems, including oculomotor control, vergence, and accommodation [5,8,9,10]. Oculomotor function supports reading through accurate fixation control and saccadic eye movements, which are essential for text navigation and spatial stability [11,12,13]. In schoolchildren, oculomotor performance has been associated with reading acquisition and reading speed, and higher rates of oculomotor anomalies have been reported in poor readers [14,15,16].
Likewise, the accommodative system is essential for maintaining clear focus during near tasks, and significant differences in accommodative amplitude and flexibility have been described in children with reading difficulties compared with controls [17]. In addition, accommodative and binocular dysfunctions, together with uncorrected refractive errors, have been associated with an increased likelihood of reading difficulties [18].
However, not all visual domains appear to have the same relationship with learning outcomes. Although orthoptic treatment improves symptoms and clinical signs of convergence insufficiency, randomized evidence has not consistently demonstrated superior gains in standardized reading performance after vergence/accommodative therapy [19,20]. Moreover, some studies have reported weak or absent associations between selected visual skills and reading ability [21,22]. These inconsistencies suggest that previous literature has often focused on isolated visual domains or specific outcomes, making it difficult to estimate the comparative contribution of multiple systems within the same analytical framework.
Similarly, axial length may be associated with academic exposure through its link with myopia development, but its relationship with learning difficulties is likely indirect, reflecting biometric rather than functional mechanisms [23,24,25]. Despite this, axial length has rarely been evaluated alongside functional visual systems as a potential predictor of academic difficulties.
Therefore, integrative approaches are needed to compare oculomotor, accommodative, vergence, and biometric domains simultaneously, particularly using analytical methods capable of modeling nonlinear relationships and interactions. Traditional statistical approaches may fail to capture complex nonlinear interactions among multiple visual subsystems, whereas machine learning models may improve predictive classification.
The aim of the present study was to evaluate the discriminative ability of five visual problem domains (objective oculomotor alterations assessed with DIVE, clinical oculomotor alterations, accommodative dysfunction, vergence dysfunction, and abnormal axial length) to identify teacher-reported academic difficulties in primary school children using machine learning models.
We hypothesized that functional systems directly involved in reading and visual attention, particularly oculomotor and accommodative domains, would show greater predictive value than biometric measures such as axial length. Early identification of visual profiles associated with learning difficulties may help optimize referral pathways and support individualized educational strategies.

2. Materials and Methods

2.1. Study Design

An analytical observational study was conducted to evaluate the discriminative ability of different visual anomalies for identifying teacher-reported academic difficulties in primary school children, using supervised machine learning classification models. The study protocol was approved by the Ethics Committee for Research with Medicines of Hospital Clínico San Carlos (Madrid, Spain) (Reference: 23/425-E).

2.2. Participants

The sample consisted of 581 primary school children recruited from Educare Valdefuentes School, a private state-funded school located in Sanchinarro, Madrid (Spain).
The mean age of participants was 8.47 years (SD = 1.74; range: 5–12 years), and 53.4% were girls.
The initial sample (n = 581) was refined using complete-case analysis, excluding 75 children with incomplete data, resulting in a final dataset of 506 participants. This dataset was randomly divided into a training set (70%, n = 354) and an independent test set (30%, n = 152).
Five classification algorithms were trained: support vector machine (SVM), k-nearest neighbors (k = 3), decision tree, random forest, and extreme gradient boosting (XGBoost). Model performance was evaluated in the test set using discrimination and classification metrics, including receiver operating characteristic (ROC) curves, confusion matrices, and derived performance indicators.
To minimize potential bias related to data leakage, the training–test split was performed before any data preprocessing. All transformations were fitted exclusively on the training set and subsequently applied to the test set.
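The ordering described above (split first, then fit transformations on the training rows only) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code. The helper names (`split_70_30`, `fit_standardizer`, `apply_standardizer`), the seeds, the synthetic feature, and the choice of standardization as the example transformation are all assumptions, since the source does not state which transformations were applied.

```python
import random
import statistics


def split_70_30(n_samples, seed=0):
    """Shuffle row indices and split them 70/30 into train and test."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * 0.7)  # 506 rows -> 354 training rows
    return indices[:n_train], indices[n_train:]


def fit_standardizer(train_values):
    """Estimate mean and SD from the training rows only."""
    return statistics.fmean(train_values), statistics.stdev(train_values)


def apply_standardizer(values, mean, sd):
    """Apply previously fitted parameters without re-estimating them."""
    return [(v - mean) / sd for v in values]


# Step 1: split before any preprocessing.
train_idx, test_idx = split_70_30(506)

# Synthetic continuous feature, for illustration only (not study data).
rng = random.Random(1)
feature = [rng.gauss(23.0, 1.0) for _ in range(506)]

# Step 2: fit on the training set, then reuse the same parameters on the test set.
mean, sd = fit_standardizer([feature[i] for i in train_idx])
train_scaled = apply_standardizer([feature[i] for i in train_idx], mean, sd)
test_scaled = apply_standardizer([feature[i] for i in test_idx], mean, sd)

print(len(train_idx), len(test_idx))  # 354 152
```

Because the test rows never contribute to `mean` or `sd`, the test set stays independent of the fitted preprocessing.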
Figure 1. Machine learning pipeline. A total of 581 participants were included, with 75 excluded due to incomplete data (final n = 506), which was divided into 70% training (n = 354) and 30% test (n = 152).

2.3. Procedure

After obtaining authorization from the school administration, informed consent was distributed to the parents or legal guardians of participating children through Google Forms. The document included a detailed description of the study protocol, as well as information regarding the potential benefits and risks associated with participation.
Simultaneously, teachers responsible for each class completed a specific questionnaire to assess the academic performance of each participating child. Academic performance was rated using an ordinal 1–5 scale (1 = very poor; 5 = very good). Given the use of a non-standardized scale and the asymmetric distribution toward higher categories, potential ceiling effects and evaluator bias were considered. The variable was dichotomized according to a predefined criterion, with scores <3 classified as teacher-reported academic difficulties. Teacher ratings were selected as a practical school-based indicator of day-to-day academic functioning and were not intended to replace standardized psychoeducational or neuropsychological assessment.
All visual examinations were conducted individually by an experienced optometrist in a dedicated room within the school facilities.

2.4. Variables

2.4.1. Dependent Variable

The dependent variable was the presence of teacher-reported academic difficulties. Academic performance was rated by classroom teachers using a 1–5 ordinal scale, where 1 indicated very poor performance and 5 indicated very good performance. This measure was used as a pragmatic school-based indicator of academic functioning rather than as a formal neuropsychological diagnosis of a specific learning disorder.
For the purposes of binary classification, the variable was dichotomized according to a predefined criterion. Children with scores ≥3 were classified as not presenting teacher-reported academic difficulties, whereas children with scores <3 were classified as presenting teacher-reported academic difficulties.
The coding scheme was as follows:
0 = absence of teacher-reported academic difficulties (score ≥3)
1 = presence of teacher-reported academic difficulties (score <3)
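The coding scheme above can be written as a small function. This is a minimal sketch; the function name `has_academic_difficulty` and the input validation are illustrative assumptions rather than part of the study's code.

```python
def has_academic_difficulty(teacher_score):
    """Return 1 if the 1-5 teacher rating is below 3, otherwise 0."""
    if teacher_score not in (1, 2, 3, 4, 5):
        raise ValueError("teacher_score must be an integer from 1 to 5")
    return 1 if teacher_score < 3 else 0


labels = [has_academic_difficulty(score) for score in (1, 2, 3, 4, 5)]
print(labels)  # [1, 1, 0, 0, 0]
```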

2.4.2. Independent Variables

Five groups of visual variables were analyzed, each corresponding to a specific visual domain.
Group 1: Oculomotor Alterations Assessed with DIVE
Oculomotor performance was evaluated using the DIVE device (DIVE® Medical S.L., Zaragoza, Spain), which incorporates an eye-tracking system to quantify ocular motor behavior. The device provides a global oculomotor performance score, as well as specific scores for short and long fixations, smooth pursuits, and saccadic movements. Scores above 40 points were considered efficient, whereas scores below 40 indicated impaired performance. Variables were coded according to the presence or absence of oculomotor alterations.
Group 2: Clinical Oculomotor Alterations
Oculomotor function was also assessed using the clinical “double H” examination. Motility was rated on a 1–5 Likert scale, where 1 indicated very poor motility and 5 indicated very good motility. Scores <3 were classified as abnormal oculomotor performance.
Group 3: Accommodative System Alterations
The presence or absence of accommodative dysfunction was recorded. Monocular accommodative facility measured with flipper lenses was used as the reference test. Values outside age-adjusted normative ranges were classified as abnormal. Normative criteria are summarized in Table 1.
Group 4: Vergence System Alterations
The presence or absence of vergence dysfunction was determined using near point of convergence (NPC) and vergence testing. Results outside normative values were classified as abnormal according to the criteria presented in Table 1.
Group 5: Abnormal Axial Length
Axial length was measured using an optical biometer (IOLMaster 700 Swept Source Biometry®, Carl Zeiss Meditec AG, Jena, Germany). Ocular axial length was classified as abnormal when values fell outside the mean ± standard deviation of the study sample distribution. This criterion was used as an internal exploratory threshold rather than a clinical diagnostic cutoff.
For all groups, variables were coded as follows:
0 = absence of visual anomaly
1 = presence of visual anomaly
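The binary coding of three of the predictor groups could be sketched as below. The function names are hypothetical, and two details are assumptions because the text does not specify them: a DIVE score of exactly 40 is treated as efficient, and "mean ± standard deviation" is read as mean ± 1 SD. The axial length values are invented for illustration and are not study data.

```python
import statistics


def code_dive(score, cutoff=40):
    """1 = impaired (score below the cutoff), 0 = efficient."""
    # Assumption: a score of exactly 40 counts as efficient.
    return 1 if score < cutoff else 0


def code_clinical_motility(likert):
    """1 = abnormal motility (double H rating below 3), 0 = normal."""
    return 1 if likert < 3 else 0


def code_axial_length(value_mm, sample_mean, sample_sd):
    """1 = outside mean ± 1 SD of the sample, 0 = inside (assumes 1 SD)."""
    lower = sample_mean - sample_sd
    upper = sample_mean + sample_sd
    return 0 if lower <= value_mm <= upper else 1


# Invented axial lengths in millimetres, for illustration only.
axial_lengths = [22.0, 23.0, 23.5, 24.0, 25.5]
al_mean = statistics.fmean(axial_lengths)
al_sd = statistics.stdev(axial_lengths)
axial_codes = [code_axial_length(v, al_mean, al_sd) for v in axial_lengths]

print(code_dive(35), code_dive(55))  # 1 0
print(code_clinical_motility(2), code_clinical_motility(4))  # 1 0
print(axial_codes)  # [1, 0, 0, 0, 1]
```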

2.5. Statistical Analysis

To evaluate the discriminative ability of each visual variable group for identifying teacher-reported academic difficulties, supervised classification models were applied.
As described above, the dataset was divided into a training set (n = 354) and an independent test set (n = 152). For each group of visual predictors, five machine learning algorithms were trained: support vector machine (SVM), k-nearest neighbors (KNN; k = 3), decision tree, random forest, and extreme gradient boosting (XGBoost).
Model performance was assessed in the test set using classification metrics, including accuracy, macro F1-score, and confusion matrices. In addition, receiver operating characteristic (ROC) curves were generated to evaluate the discriminative ability of each model, and overall performance was quantified using the area under the ROC curve (AUC).
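As an illustration of what the AUC measures, the sketch below computes it as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy scores are assumptions for illustration. This is not the routine used in the study, which relied on scikit-learn.

```python
def roc_auc(y_true, y_score):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy labels and scores (not study data).
auc_toy = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
auc_constant = roc_auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5])
print(auc_toy, auc_constant)  # 0.75 0.5
```

A model that assigns the same score to every child gives an AUC of 0.5, which is the chance level.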
Sensitivity, specificity, positive predictive value, and negative predictive value were also derived from the confusion matrices for the best-performing model within each predictor group.
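The derived indicators follow from the four confusion-matrix cells. The sketch below uses a hypothetical helper, `confusion_metrics`, and applies it to the cell counts reported for the DIVE predictor group in Section 3.3.1 (25 true positives, 115 true negatives, 1 false positive, 11 false negatives). The rounded sensitivity, specificity, and predictive values are computed here and are not quoted from the source.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Derive standard binary classification metrics from confusion-matrix cells."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    f1_positive = ratio(2 * tp, 2 * tp + fp + fn)
    f1_negative = ratio(2 * tn, 2 * tn + fn + fp)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        # Macro F1 is the unweighted mean of the per-class F1 scores.
        "macro_f1": (f1_positive + f1_negative) / 2,
    }


dive = confusion_metrics(tp=25, tn=115, fp=1, fn=11)
for name, value in dive.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.921
# sensitivity: 0.694
# specificity: 0.991
# ppv: 0.962
# npv: 0.913
# macro_f1: 0.878
```

The accuracy and macro F1-score obtained this way agree with the values reported for this predictor group (0.921 and 0.878).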
The 70/30 train–test split was used to provide an independent test set for estimating out-of-sample classification performance. Given the exploratory and comparative nature of the study, the main objective was not to develop a definitive clinical prediction tool, but to compare the relative discriminative performance of different visual domains under the same analytical conditions. Although an independent test set was used, no external validation cohort was available. Therefore, the reported performance metrics should be interpreted as internal validation results and require confirmation through repeated resampling procedures and external validation cohorts.
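One way the repeated resampling mentioned above could be set up is sketched below. It was not performed in this study, and the function names, the number of repeats, and the toy label vector are all assumptions. Each repeat draws a new stratified 70/30 split, so that a metric can be summarized across splits rather than read from a single partition.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices so each class keeps roughly the same share in train and test."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return train, test


def repeated_splits(labels, n_repeats=20, test_fraction=0.3):
    """Generate one stratified split per seed (seeds 0..n_repeats-1)."""
    return [stratified_split(labels, test_fraction, seed) for seed in range(n_repeats)]


# Toy outcome vector: 80 negatives and 20 positives (not study data).
toy_labels = [0] * 80 + [1] * 20
splits = repeated_splits(toy_labels, n_repeats=5)
for train, test in splits:
    n_pos_test = sum(toy_labels[i] for i in test)
    print(len(train), len(test), n_pos_test)  # 70 30 6 on every repeat
```

A model would then be refitted and scored on each split, and the spread of the resulting metrics would indicate how much a single partition can influence the reported performance.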
Additional model interpretability analyses, such as feature importance or precision–recall curves, were not included. However, these approaches represent relevant future directions for improving model interpretation and clinical applicability.
Analyses were performed using Python (version 3.11.7) with scikit-learn and XGBoost libraries.

3. Results

3.1. Participant Characteristics

A total of 581 children were initially recruited. After excluding 75 participants with incomplete data, 506 children were included in the final analysis. The mean age was 8.47 ± 1.74 years (range 5–12), and 53.4% were girls.

3.2. Comparative Classification Performance Across Visual Domains

The classification performance of five supervised classification algorithms was evaluated for each of the five visual domains analyzed. Model performance was assessed in the independent test set using accuracy, macro F1-score, confusion matrices, and receiver operating characteristic (ROC) curves.
Clear differences emerged across predictor groups (Table 2). The highest discriminative performance was observed for accommodative variables, followed by clinically assessed and instrumentally assessed oculomotor predictors. In contrast, vergence variables showed high specificity but poor sensitivity, whereas axial length demonstrated the lowest overall discriminative capacity.
Overall, these findings indicate that functional visual domains directly involved in near vision tasks may be more informative predictors of teacher-reported academic difficulties than vergence or biometric variables.
This pattern was not explained by accuracy alone. For example, vergence alterations achieved high accuracy mainly because most children in the test set were classified as negative cases, whereas sensitivity remained very low. Therefore, macro F1-score, sensitivity, and AUC provided a more informative assessment of model performance than accuracy alone.

3.3. Functional Domains with Highest Predictive Performance

3.3.1. Oculomotor Alterations Assessed with DIVE

For the predictor set corresponding to oculomotor alterations assessed with the DIVE device, XGBoost showed the best discriminative performance, achieving an accuracy of 0.921 and a macro F1-score of 0.878.
The confusion matrix showed 115 true negatives, 25 true positives, 11 false negatives, and 1 false positive. The model also achieved an AUC of 0.84, indicating good discriminative ability for teacher-reported academic difficulties (Figure 2).
This performance pattern suggests that objective eye-tracking-based oculomotor assessment captured a relevant signal associated with learning difficulties. However, the presence of 11 false-negative cases indicates that some children with teacher-reported academic difficulties were not identified by this predictor set alone.

3.3.2. Clinically Assessed Oculomotor Alterations

For clinically assessed oculomotor alterations (double H examination), XGBoost also showed strong classification performance, with an accuracy of 0.934 and a macro F1-score of 0.918.
The confusion matrix showed 105 true negatives, 37 true positives, 9 false negatives, and 1 false positive. The corresponding AUC was 0.90, indicating excellent discrimination (Figure 3).
Compared with DIVE-based assessment, the clinical oculomotor examination showed slightly higher sensitivity and macro F1-score, suggesting that the clinical double H examination may capture complementary aspects of ocular motility relevant to school performance.
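The sensitivity difference can be read from the two confusion matrices above, using sensitivity = TP / (TP + FN). The rounded values below are derived here from the reported cell counts and are not stated in the source:

```latex
\[
\text{Sensitivity}_{\text{DIVE}} = \frac{25}{25 + 11} \approx 0.694,
\qquad
\text{Sensitivity}_{\text{clinical}} = \frac{37}{37 + 9} \approx 0.804
\]
```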

3.3.3. Accommodative System Alterations

Among all predictor groups, accommodative variables showed the highest discriminative performance. XGBoost achieved an accuracy of 1.000 and a macro F1-score of 1.000.
The confusion matrix showed 129 true negatives and 23 true positives, with no false positives or false negatives. The model achieved an AUC of 1.00, indicating perfect discrimination in the test set (Figure 4).
This result suggests that accommodative alterations provided the strongest separation between children with and without teacher-reported academic difficulties in this dataset. Nevertheless, because error-free classification is unusual in clinical-educational data and may be influenced by the specific train–test split, this finding should be interpreted cautiously. Confirmation through repeated resampling procedures and external validation cohorts is required before considering this model clinically generalizable.

3.4. Domains with Lower Discriminative Utility

3.4.1. Vergence System Alterations

For the vergence predictor set, Random Forest achieved high overall accuracy (0.954), although macro F1-score was limited (0.599).
The confusion matrix showed 144 true negatives, 1 true positive, 7 false negatives, and no false positives. Although specificity was perfect (1.000), sensitivity was low (0.125). The AUC was 0.50, indicating no meaningful discriminative ability beyond chance (Figure 5).
This pattern indicates that the model was highly effective at identifying children without learning difficulties, but had limited ability to detect children with teacher-reported academic difficulties. Therefore, despite the high accuracy, vergence variables showed limited usefulness as standalone predictors in this dataset.
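The gap between accuracy and sensitivity can be made concrete with a majority-class baseline, that is, a rule that labels every child as negative. The sketch below uses the cell counts reported above for the vergence model (144 negatives and 8 positives in the test set). The helper name `majority_baseline_accuracy` is hypothetical and the baseline figure is computed here, not taken from the source.

```python
def majority_baseline_accuracy(n_negative, n_positive):
    """Accuracy obtained by always predicting the more frequent class."""
    return max(n_negative, n_positive) / (n_negative + n_positive)


# Vergence test set: 144 true negatives + 0 false positives = 144 negatives,
# 1 true positive + 7 false negatives = 8 positives.
baseline_accuracy = majority_baseline_accuracy(n_negative=144, n_positive=8)
model_accuracy = (144 + 1) / 152

print(round(baseline_accuracy, 3))  # 0.947
print(round(model_accuracy, 3))     # 0.954
```

Under these counts the model's accuracy exceeds the all-negative rule by a single correctly classified child.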

3.4.2. Abnormal Axial Length

For abnormal axial length, Random Forest was the best-performing model among those tested, although overall predictive performance was limited.
The model achieved an accuracy of 0.678 and a macro F1-score of 0.558. The confusion matrix showed 91 true negatives, 12 true positives, 46 false negatives, and 3 false positives. The AUC was 0.54, indicating poor discrimination between children with and without learning difficulties (Figure 6).
This indicates that axial length alone was not sufficient to distinguish children with and without learning difficulties. The high number of false-negative cases suggests that biometric information may have limited direct predictive value when considered independently from functional visual variables.
Overall, the results revealed a clear hierarchy of discriminative utility across visual systems. Accommodation showed the strongest performance, and oculomotor variables provided robust and consistent discrimination across both objective and clinical assessments. In contrast, vergence and axial length showed limited ability to detect children with teacher-reported academic difficulties despite acceptable or high specificity.

4. Discussion

The present study demonstrates that the discriminative ability of visual alterations to identify learning difficulties differed substantially across visual systems. Machine learning models captured consistent differences between functional domains directly involved in near visual tasks (oculomotor and accommodative systems) and domains with a more indirect or multifactorial relationship (vergence and biometric variables). Overall, these findings support a school-based visual screening approach focused on visual efficiency beyond visual acuity alone and suggest that discriminative utility depends on the type of dysfunction assessed and its functional proximity to academically demanding tasks.
Consistent with previous studies linking oculomotor performance to reading acquisition, our results identified ocular motility alterations among the most informative predictors. Portnoy et al. [14] reported that Developmental Eye Movement (DEM)-based measures, particularly speed-related parameters, were associated with early reading performance and showed discriminative ability for identifying reduced reading speed. Similarly, Ibrahimi et al. [16] reported that a high proportion of children with poor reading abilities showed oculomotor difficulties when assessed with the Developmental Eye Movement test, supporting the relevance of saccadic control, visual sequencing, and fixation-related processes in reading performance. Our findings are in line with this framework: oculomotor performance, assessed both instrumentally (DIVE) and clinically (double H examination), emerged as a visual domain closely linked to reading and writing tasks requiring visuospatial sequencing, sustained visual attention, and accurate eye movement control.
At the accommodative level, our models showed the highest classification performance. This result is directionally consistent with previous clinical evidence identifying accommodative differences in children with reading difficulties. Palomo-Álvarez et al. [17] observed significantly reduced accommodative amplitude and binocular accommodative facility in children with reading difficulties compared with controls, supporting the view that efficient near focusing may influence sustained reading performance. In addition, Ceple et al. [18] reported that significant refractive error and/or accommodative-binocular dysfunctions were associated with a higher probability of reading difficulties, even in the absence of increased visual complaints. Together, these findings highlight the importance of systematically assessing accommodation in school settings.
However, the perfect classification observed for the accommodative predictor group should be interpreted cautiously. In real-world clinical-educational datasets, error-free classification may reflect a strong discriminative signal, but it may also be influenced by methodological factors such as variables highly correlated with the teacher-rated dichotomized outcome, the specific characteristics of the train–test split, class distribution, or overfitting. Therefore, this result should be considered exploratory and hypothesis-generating rather than definitive, and it represents a high-priority target for replication using repeated resampling procedures and external validation cohorts.
For the vergence domain, our results suggest a more limited discriminative ability, particularly for identifying positive cases. Although overall accuracy was high, this was largely driven by the predominance of negative cases in the test set, whereas sensitivity remained very low. This highlights the importance of interpreting classification metrics beyond accuracy alone when class imbalance is present.
This pattern is compatible with literature recognizing the impact of convergence insufficiency on symptoms and near-task performance, while reporting heterogeneous findings when outcomes are defined as standardized reading achievement. Scheiman et al. [19] demonstrated that office-based orthoptic therapy significantly improves symptoms and clinical signs such as near point of convergence and positive fusional vergence. However, the CITT-ART trial [20] did not demonstrate superior improvements over placebo therapy in standardized reading measures following vergence/accommodative therapy. This suggests that the pathway from improved binocular function and reduced symptoms to measurable gains in reading performance is neither automatic nor necessarily immediate. In our study, the lower discriminative ability of vergence variables may reflect mediation through intermediate factors such as symptoms, fatigue, compensatory strategies, and the multifactorial nature of learning.
Finally, axial length showed lower discriminative ability than functional systems directly involved in reading and attention. This finding is consistent with current models of myopia development. Population and genetic evidence indicates that educational exposure is robustly associated with myopia and that the predominant direction of effect runs from education toward refractive change [6,7], whereas axial length acts as a central biometric substrate of the refractive phenotype. In children, accelerated axial elongation markedly increases the risk of myopia [24], and combined refractive-biometric criteria have been proposed for risk stratification [25]. Nevertheless, these data suggest that the contribution of axial length to learning difficulties is usually indirect: biometrics may inform refractive risk and related visual consequences, but do not directly capture visuocognitive processes such as oculomotor control or selective visual attention that are immediately expressed during reading and writing.
From an applied perspective, the results support incorporating visual efficiency assessments, particularly oculomotor and accommodative testing, into school screening programs, as these domains showed greater ability to identify profiles at risk of learning difficulties. In addition, Zihl et al. [27] and Hokken et al. [28] reported that in cerebral visual impairment and other neurovisual phenotypes the principal bottleneck may lie in higher-order visual functions such as visual search, figure-ground perception, and visual attention. Academic performance should therefore be interpreted within an integrated framework that includes both ocular function and visual processing. In this context, machine learning models may serve as decision-support tools to prioritize clinical referrals and guide individualized educational adaptations, without replacing comprehensive clinical assessment or educational judgment [29].
The main strengths of this study include its integrative design, directly comparing five visual domains (objective and clinical oculomotor performance, accommodation, vergence, and biometrics) as predictors of learning difficulties within the same analytical framework. In addition, the comparison of several machine learning algorithms, including ensemble methods such as random forest and XGBoost, adds methodological robustness by allowing the modeling of nonlinear relationships and plausible interactions among clinical-functional variables. Data collection in a real school setting and the relatively large sample of primary school children enhance the practical relevance of the findings for screening and referral prioritization. Finally, the use of two complementary approaches for oculomotor assessment (DIVE and clinical examination) strengthens confidence in the consistency of this domain through multimethod evidence.
Several limitations should also be considered. Learning difficulties were defined using a teacher-rated Likert-scale measure that was subsequently dichotomized. This approach may introduce measurement bias, including ceiling effects. Although teacher ratings are not equivalent to formal neuropsychological diagnosis, they may still provide useful school-based information regarding functional academic performance [30]. The cross-sectional design prevents causal inference, and findings should therefore be interpreted as predictive or associative rather than causal. The sample was recruited from a single school, and participation depended on family consent, which may limit generalizability and introduce selection bias. Moreover, potentially relevant confounding variables were not systematically recorded. These include neurodevelopmental conditions such as dyslexia or attention-deficit/hyperactivity disorder, as well as socioeconomic background, language-related factors, prior academic support, and previous optical correction. Such variables may influence both school performance and visual test outcomes, and therefore could partially account for some observed associations.
In addition, several predictor groups were operationalized using binary classifications (presence/absence of anomaly) based on clinical thresholds. Although this approach improved comparability across visual domains and enhanced potential applicability for screening purposes, it may have reduced the richness of continuous clinical information and attenuated more subtle associations. Future studies should examine whether continuous or multidimensional representations of visual function improve predictive performance.
Finally, the low prevalence of positive cases in some predictor groups may have affected sensitivity and distorted global performance metrics, while the exceptionally high performance observed in certain domains requires further validation through external cohorts or repeated partitioning strategies to rule out overfitting and confirm model stability.

5. Conclusions

The present study suggests that the discriminative value of visual alterations for identifying teacher-reported academic difficulties in schoolchildren differs substantially across visual domains. Functional systems directly involved in near visual tasks, particularly oculomotor control and accommodation, showed greater discriminative performance than vergence measures or axial length. These findings support extending school-based vision screening beyond visual acuity alone to include assessments of visual efficiency that may be more closely related to academic functioning.
From a practical perspective, machine learning approaches may help integrate multidimensional visual data and assist in prioritizing referrals or guiding individualized educational support strategies. However, such tools should be considered complementary to, rather than replacements for, comprehensive clinical and educational assessment.
These findings should be interpreted in light of several limitations, including the use of a teacher-rated proxy measure for learning difficulties, the cross-sectional design, recruitment from a single school, and the need for external validation of the predictive models. Future studies should incorporate standardized neuropsychological outcomes, multicenter samples, longitudinal designs, and repeated validation procedures to confirm the robustness and clinical applicability of these results.

Author Contributions

Conceptualization, R.G.-J. and F.J.P.-M.; methodology, R.G.-J., J.R.T. and F.J.P.-M.; software, J.R.T.; validation, R.G.-J., J.R.T. and F.J.P.-M.; formal analysis, J.R.T.; investigation, R.G.-J.; resources, R.G.-J. and F.J.P.-M.; data curation, R.G.-J. and J.R.T.; writing—original draft preparation, R.G.-J.; writing—review and editing, F.J.P.-M., R.B.-V., J.E.C.-S. and C.O.-C.; visualization, J.R.T.; supervision, F.J.P.-M.; project administration, R.G.-J. and F.J.P.-M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee for Research with Medicines of Hospital Clínico San Carlos, Madrid, Spain (protocol code 23/425-E and date of approval 20 September 2023).

Data Availability Statement

The data presented in this study are available from the corresponding author upon reasonable request. The data are not publicly available due to privacy and ethical restrictions involving minors.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AUC Area under the curve
CITT-ART Convergence Insufficiency Treatment Trial–Attention and Reading Trial
DIVE Digital Vision Evaluation
FN False negatives
FP False positives
KNN k-nearest neighbors
ML Machine learning
NPC Near point of convergence
NPV Negative predictive value
PPV Positive predictive value
ROC Receiver operating characteristic
SVM Support vector machine
TN True negatives
TP True positives
XGBoost Extreme gradient boosting

References

  1. Skenderidou, I.; Leontopoulos, S.; Stafylis, N.; Skenderidis, P. School Dropout: Causes, Consequences, and Strategies for Prevention. Eur. J. Educ. Stud. 2025, 12, 176–196. [CrossRef]
  2. Fletcher, J.M.; Miciak, J. Assessment of Specific Learning Disabilities and Intellectual Disabilities. Assessment 2024, 31, 53–74. [CrossRef]
  3. American Academy of Optometry; American Optometric Association. Vision, Learning and Dyslexia: A Joint Organizational Policy Statement. American Academy of Optometry; American Optometric Association: United States, 1997.
  4. Silveira, S. Exploring the Dualism of Vision—Visual Function and Functional Vision. Vis. Rehabil. Int. 2018, 10, 1–10.
  5. Bennett, C.R.; Bex, P.J.; Bauer, C.M.; Merabet, L.B. The Assessment of Visual Function and Functional Vision. Semin. Pediatr. Neurol. 2019, 31, 30–40. [CrossRef]
  6. Mountjoy, E.; Davies, N.M.; Plotnikov, D.; Smith, G.D.; Rodriguez, S.; Williams, C.E.; Guggenheim, J.A.; Atan, D. Education and Myopia: Assessing the Direction of Causality by Mendelian Randomisation. BMJ 2018, 361, k2022. [CrossRef]
  7. Williams, K.M.; Bertelsen, G.; Cumberland, P.; Wolfram, C.; Verhoeven, V.J.M.; Anastasopoulos, E.; Buitendijk, G.H.S.; Cougnard-Grégoire, A.; Creuzot-Garcher, C.; Erke, M.G.; et al. Increasing Prevalence of Myopia in Europe and the Impact of Education. Ophthalmology 2015, 122, 1489–1497. [CrossRef]
  8. Quinet, J.; Schultz, K.; May, P.J.; Gamlin, P.D. Neural Control of Vergence and Ocular Accommodation. Annu. Rev. Vis. Sci. 2025, 11, 43–72. [CrossRef]
  9. Sánchez-González, M.C.; Palomo-Carrión, R.; De-Hita-Cantalejo, C.; Romero-Galisteo, R.P.; Gutiérrez-Sánchez, E.; Piñero-Pinto, E. Visual System and Motor Development in Children: A Systematic Review. Acta Ophthalmol. 2022, 100, e1356–e1369. [CrossRef]
  10. Rodrigues, P.; Woodburn, J.; Bond, A.J.; Stockman, A.; Vera, J. Light-Based Manipulation of Visual Processing Speed during Soccer-Specific Training Has a Positive Impact on Visual and Visuomotor Abilities in Professional Soccer Players. Ophthalmic Physiol. Opt. 2025, 45, 504–513. [CrossRef]
  11. Chamani, N.; Schmid, M.C.; Rima, S. Unstable Foveation’s Impact on Reading, Object Tracking, and Its Implications for Diagnosing and Intervening in Reading Difficulties. Sci. Rep. 2025, 15, 6546. [CrossRef]
  12. Laborde, Q.; Roques, A.; Armougum, A.; Vayatis, N.; Bargiotas, I.; Oudre, L. Vision Toolkit Part 2. Features and Metrics for Assessing Oculomotor Signal: A Review. Front. Physiol. 2025, 16, 1661026. [CrossRef]
  13. Kulp, M.T.; Schmidt, P.P. Effect of Oculomotor and Other Visual Skills on Reading Performance: A Literature Review. Optom. Vis. Sci. 1996, 73, 283–292. [CrossRef]
  14. Portnoy, A.; Gilaie-Dotan, S. Oculomotor-Related Measures Are Predictive of Reading Acquisition in First Grade Early Readers. Vision 2025, 9, 48. [CrossRef]
  15. Rodriguez, A.R.; Barton, J.J.S. The 20/20 Patient Who Can’t Read. Can. J. Ophthalmol. 2015, 50, 257–264. [CrossRef]
  16. Ibrahimi, D.; Aviles, M.; Rodríguez-Reséndiz, J. Oculomotor Patterns in Children with Poor Reading Abilities Measured Using the Development Eye Movement Test. J. Clin. Med. 2024, 13, 4415. [CrossRef]
  17. Palomo-Álvarez, C.; Puell, M.C. Accommodative Function in School Children with Reading Difficulties. Graefes Arch. Clin. Exp. Ophthalmol. 2008, 246, 1769–1774. [CrossRef]
  18. Ceple, I.; Švede, A.; Šerpa, E.; Kassaliete, E.; Volberga, L.; Miķelsone, R.; Krūmiņa, G. The Prevalence of Accommodative and Binocular Dysfunctions in Children with Reading Difficulties. Life 2025, 15, 7. [CrossRef]
  19. Scheiman, M.; Mitchell, G.L.; Cotter, S.; Cooper, J.; Kulp, M.; Rouse, M.; Borsting, E.; London, R.; Wensveen, J.; Convergence Insufficiency Treatment Trial Study Group. A Randomized Clinical Trial of Treatments for Convergence Insufficiency in Children. Arch. Ophthalmol. 2005, 123, 14–24. [CrossRef]
  20. CITT-ART Investigator Group. Effect of Vergence/Accommodative Therapy on Reading in Children with Convergence Insufficiency: A Randomized Clinical Trial. Optom. Vis. Sci. 2019, 96, 836–849. [CrossRef]
  21. Kiely, P.M.; Crewther, S.G.; Crewther, D.P. Is There an Association between Functional Vision and Learning to Read? Clin. Exp. Optom. 2001, 84, 346–353. [CrossRef]
  22. Kaye, G. Vision and Learning to Read. Clin. Exp. Optom. 2002, 85, 111. [CrossRef]
  23. Yang, Y.; Li, R.; Ting, D.; Wu, X.; Huang, J.; Zhu, Y.; Chen, C.; Lin, H.; Chen, W. The Associations of High Academic Performance with Childhood Ametropia Prevalence and Myopia Development in China. Ann. Transl. Med. 2021, 9, 745. [CrossRef]
  24. Tideman, J.W.L.; Polling, J.R.; Vingerling, J.R.; Jaddoe, V.W.V.; Williams, C.; Guggenheim, J.A.; Klaver, C.C.W. Axial Length Growth and the Risk of Developing Myopia in European Children. Acta Ophthalmol. 2018, 96, 301–309. [CrossRef]
  25. McCullough, S.; Adamson, G.; Breslin, K.M.M.; McClelland, J.F.; Doyle, L.; Saunders, K.J. Axial Growth and Refractive Change in White European Children and Young Adults: Predictive Factors for Myopia. Sci. Rep. 2020, 10, 15189. [CrossRef]
  26. Morgan, M.W. The Analysis of Clinical Data. Am. J. Optom. Arch. Am. Acad. Optom. 1944, 21, 477–491.
  27. Zihl, J.; Unterberger, L.; Lippenberger, M. Visual and Cognitive Profiles in Children with and without Cerebral Visual Impairment. Br. J. Vis. Impair. 2024, 42, 557–576. [CrossRef]
  28. Hokken, M.J.; Stein, N.; Pereira, R.R.; Rours, I.G.I.J.G.; Frens, M.A.; van der Steen, J.; et al. Eyes on CVI: Eye Movements Unveil Distinct Visual Search Patterns in Cerebral Visual Impairment Compared to ADHD, Dyslexia, and Neurotypical Children. Res. Dev. Disabil. 2024, 151, 104767. [CrossRef]
  29. Rajkomar, A.; Dean, J.; Kohane, I. Machine Learning in Medicine. N. Engl. J. Med. 2019, 380, 1347–1358. [CrossRef]
  30. Hlutkowsky, C.O.; All, K.E.; Roule, A.L.; Warner, T.A.; Huang-Pollock, C. A Comparison of Commercially Available Parent and Teacher Rating Forms in the Concurrent Prediction of Executive Functioning Performance in Children. J. Atten. Disord. 2026, 30, 99–115. [CrossRef]
Figure 2. Receiver operating characteristic (ROC) curve of the XGBoost classifier for oculomotor alterations assessed with the DIVE device. The model achieved an AUC of 0.84, indicating good discriminative ability for teacher-reported academic difficulties.
Figure 3. Receiver operating characteristic (ROC) curve of the XGBoost classifier for clinically assessed oculomotor alterations (double H examination). The model achieved an AUC of 0.90, indicating excellent discriminative ability for teacher-reported academic difficulties.
Figure 4. Receiver operating characteristic (ROC) curve of the XGBoost classifier for accommodative system alterations. The model achieved an AUC of 1.00, indicating perfect discrimination between children with and without teacher-reported academic difficulties in the test set.
Figure 5. Receiver operating characteristic (ROC) curve of the Random Forest classifier for vergence system alterations. The model achieved an AUC of 0.50, indicating no meaningful discriminative ability beyond chance despite high overall accuracy driven by class imbalance.
Figure 6. Receiver operating characteristic (ROC) curve of the Random Forest classifier for abnormal axial length. The model achieved an AUC of 0.54, indicating poor discriminative ability for teacher-reported academic difficulties.
Table 1. Clinical criteria used to classify accommodative and vergence anomalies. Normative values adapted from Morgan (1944) [26].
| Visual Domain | Clinical Test | Criterion for Abnormality |
|---|---|---|
| Accommodation | Monocular accommodative facility | Outside age-adjusted normative range |
| Vergence | Near point of convergence | Outside normative range |
| Vergence | Fusional vergences | Outside normative range |
| Accommodation | Amplitude/facility (if used) | Outside normative range |
Table 2. Comparative Classification Performance Across Visual Domains (best-performing model for each predictor set).
| Predictor Set (Visual Domain) | Best Model (Test Set) | Accuracy | Macro F1 | TN | FP | FN | TP | Sensitivity | Specificity | PPV | NPV | Prevalence in Test Set |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. Oculomotor Alterations (DIVE) | XGBoost | 0.921 | 0.878 | 115 | 1 | 11 | 25 | 0.694 | 0.991 | 0.962 | 0.913 | 0.237 |
| 2. Oculomotor Alterations (Double H) | XGBoost | 0.934 | 0.918 | 105 | 1 | 9 | 37 | 0.804 | 0.991 | 0.974 | 0.921 | 0.303 |
| 3. Accommodative System Alterations | XGBoost | 1.000 | 1.000 | 129 | 0 | 0 | 23 | 1.000 | 1.000 | 1.000 | 1.000 | 0.151 |
| 4. Vergence System Alterations | Random Forest | 0.954 | 0.599 | 144 | 0 | 7 | 1 | 0.125 | 1.000 | 1.000 | 0.954 | 0.053 |
| 5. Abnormal Axial Length | Random Forest | 0.678 | 0.558 | 91 | 3 | 46 | 12 | 0.207 | 0.968 | 0.800 | 0.664 | 0.382 |
Note: TN = true negatives; FP = false positives; FN = false negatives; TP = true positives; PPV = positive predictive value; NPV = negative predictive value. Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP). Predictive values depend on the prevalence of learning difficulties in the test set.
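The definitions in the note can be checked directly against the confusion-matrix counts in Table 2. The short Python sketch below recomputes each reported metric from TN, FP, FN, and TP. The function name `confusion_metrics` and the dictionary `TABLE_2_COUNTS` are illustrative. Macro-F1 is assumed here to be the unweighted mean of the two per-class F1 scores, which the source does not state explicitly.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Derive summary metrics from the four cells of a binary confusion matrix."""
    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are returned as NaN.
        return numerator / denominator if denominator else float("nan")

    total = tn + fp + fn + tp
    f1_positive = ratio(2 * tp, 2 * tp + fp + fn)
    f1_negative = ratio(2 * tn, 2 * tn + fn + fp)
    return {
        "accuracy": ratio(tp + tn, total),
        "macro_f1": (f1_positive + f1_negative) / 2,  # assumed definition
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "prevalence": ratio(tp + fn, total),
    }


# Counts copied from Table 2, in the order (TN, FP, FN, TP).
TABLE_2_COUNTS = {
    "Oculomotor (DIVE)": (115, 1, 11, 25),
    "Oculomotor (Double H)": (105, 1, 9, 37),
    "Accommodative system": (129, 0, 0, 23),
    "Vergence system": (144, 0, 7, 1),
    "Abnormal axial length": (91, 3, 46, 12),
}

for name, counts in TABLE_2_COUNTS.items():
    metrics = confusion_metrics(*counts)
    summary = ", ".join(f"{key}={value:.3f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

Under these assumptions, the recomputed values match the tabulated ones to three decimal places. The vergence row shows how accuracy (145/152) can remain high while sensitivity (1/8) is low when positive cases are rare.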
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.