Submitted: 24 September 2025
Posted: 25 September 2025
Abstract
Keywords:
1. Introduction
- Novel, Modular Framework: We integrate XGBoost, CatBoost, LightGBM, Random Forests, Gradient Boosting, and neural models into a single pipeline with robust hyperparameter tuning.
- Handling Class Imbalance: Dyslexia forms a minority class in the dataset. We employ both oversampling (SMOTE) and algorithm-specific weighting (scale_pos_weight) to enhance minority-class recall.
- Benchmark Against Prior Studies: We compare our approach to an earlier dyslexia-screening model that relied on analyzing textual reading speed and writing errors, highlighting the advantages of capturing multi-modal (click-based) interaction data [12].
- Interpretability and Detailed Analysis: We present standard confusion matrices, classification reports, and key performance indicators (KPIs). We further analyze feature importances to elucidate which exercises and performance measures are most indicative of dyslexia risk.
2. Literature Review
2.1. Dyslexia in Transparent Orthographies
2.2. Traditional Diagnostic Protocols
2.3. Machine-Learning-Based Screening
2.4. Comparison to a Prior Study
2.5. Research Gaps
3. Materials and Methods
3.1. Ethical Compliance and Design
3.2. Participants
- Group 1 (Dyslexia): 392 participants (approx. 10.7% of sample)
- Group 2 (No Dyslexia): 3,252 participants (approx. 89.3% of sample)
3.3. Gamified Test and Data Features
- Gender: 1 = Male, 2 = Female
- Age: Ranging from 7 to 17
- Nativelang: 0 = No, 1 = Yes (Spanish as native language)
- Otherlang: 0 = No, 1 = Yes (speaks more than one language)
- Clicks (Ci): Number of click actions within exercise i
- Hits (Hi): Number of correct actions in exercise i
- Misses (Mi): Number of incorrect actions in exercise i
- Score (Si): Weighted sum of hits in exercise i
- Accuracy (Ai): $A_i = \frac{H_i}{C_i}$
- Missrate (Ri): $R_i = \frac{M_i}{C_i}$
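The two ratio features defined above can be derived from the raw counts of each exercise. The following is a minimal sketch, not the authors' code: the `ExerciseLog` container, its field names, and the convention of returning 0.0 when an exercise has no clicks are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExerciseLog:
    """Raw counts for one exercise i (hypothetical container)."""
    clicks: int  # C_i
    hits: int    # H_i
    misses: int  # M_i


def accuracy(log: ExerciseLog) -> float:
    """A_i = H_i / C_i; 0.0 when no clicks were recorded (assumed convention)."""
    return log.hits / log.clicks if log.clicks else 0.0


def missrate(log: ExerciseLog) -> float:
    """R_i = M_i / C_i; 0.0 when no clicks were recorded (assumed convention)."""
    return log.misses / log.clicks if log.clicks else 0.0


example = ExerciseLog(clicks=10, hits=7, misses=3)
print(accuracy(example), missrate(example))
```

An exercise with zero clicks would otherwise raise a division error, so some explicit convention is needed; a missing-value marker would be an equally reasonable choice.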
3.4. Data Splits and Preprocessing
3.5. Class Imbalance Handling
- Scale_pos_weight: For XGBoost and CatBoost, we used $\text{scale\_pos\_weight} = \frac{\#(\text{negatives})}{\#(\text{positives})} \approx 8.28$, ensuring the model penalizes misclassifications of the minority class more heavily [42].
- Synthetic Minority Over-Sampling Technique (SMOTE): This technique synthesizes new minority samples in feature space, especially beneficial for classifiers that do not have built-in weighting [43].
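Both strategies can be illustrated in a few lines of plain Python. This is a sketch under stated assumptions, not the pipeline used in the study: `smote_like` is a hypothetical helper that reproduces only the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours) and is not the imbalanced-learn implementation. Applied to the full-sample counts from Section 3.2, the ratio comes out near 8.30; the reported 8.28 may have been computed on a training split, which is an assumption here.

```python
import random


def scale_pos_weight(n_negative: int, n_positive: int) -> float:
    """Ratio of negative to positive examples, as in the formula above."""
    return n_negative / n_positive


def smote_like(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Full-sample class counts from Section 3.2 (not a training split).
print(round(scale_pos_weight(3252, 392), 2))

# Toy minority points, purely illustrative.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like(toy_minority, n_new=3, k=2)
print(new_points)
```

Oversampling of this kind is normally applied to the training portion only, so that synthetic points never leak into the evaluation data.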
3.6. Model Training and Hyperparameter Tuning
- XGBoost (eXtreme Gradient Boosting) [44]
- CatBoost (Categorical Boosting) [45]
- LightGBM (Light Gradient Boosting Machine) [46]
- Random Forest [47]
- Gradient Boosting [48]
- Logistic Regression with balanced class weights [49]
- SVM (Support Vector Machine) with RBF kernel and class_weight="balanced" [50]
- MLP (Multi-Layer Perceptron), a feedforward neural network [51]
3.7. Ensemble Methods
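Section 4.5 reports a majority-vote ensemble. Hard voting over binary predictions can be sketched as follows; the function name, the input layout (one list of 0/1 labels per model), and the rule that ties go to the dyslexia class are assumptions for illustration, not details taken from the study.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard-voting ensemble over binary labels.

    predictions_per_model: list of equal-length 0/1 label lists, one per model.
    Ties are resolved in favour of class 1 (an assumed, recall-favouring rule).
    """
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(model_preds[i] for model_preds in predictions_per_model)
        combined.append(1 if votes[1] >= votes[0] else 0)
    return combined


# Three hypothetical models voting on four samples.
model_outputs = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
print(majority_vote(model_outputs))  # [1, 1, 0, 1]
```

With an odd number of models, ties cannot occur, so the tie rule only matters for even-sized ensembles.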
3.8. Evaluation Metrics
- Accuracy: $\frac{\text{True Positives} + \text{True Negatives}}{\text{Total Samples}}$
- Precision (Class 1): $\frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$
- Recall (Class 1): $\frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$
- F1-score (Class 1): $2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$
- Confusion Matrix: A 2×2 matrix enumerating the distribution of predictions across true classes [55].
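The metrics listed above all follow from the four cells of the confusion matrix. The sketch below computes them from scratch for binary labels, with class 1 taken as the positive (dyslexia) class; the function names, the toy labels, and the convention of returning 0.0 when a denominator is zero are assumptions for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) with class 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def class1_metrics(y_true, y_pred):
    """Accuracy plus precision, recall and F1 for class 1."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # A zero denominator yields 0.0 (assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 3 positives and 7 negatives, purely illustrative.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(confusion_counts(y_true, y_pred))  # (2, 1, 1, 6)
print(class1_metrics(y_true, y_pred))
```

In this toy case accuracy is 0.8 while class 1 precision and recall are both 2/3, which shows why accuracy alone can look favourable on imbalanced data.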
4. Results
4.1. Exploratory Data Analysis (Desktop Dataset)
4.2. Hyperparameter Tuning: XGBoost
4.3. CatBoost Model
4.4. XGBoost + SMOTE
4.5. Ensemble (Majority Vote)
4.6. Feature Importance
5. Discussion
5.1. Comparison with Previous Study
5.2. Practical Significance and Limitations
5.3. Benchmarking the Age Groups
5.4. The Significance of Customizable, Web-Based Screening
6. Conclusion
- Longitudinal Studies: Examining test–retest reliability over extended periods to determine how stable these interaction metrics remain.
- Multilingual Extensions: Replicating or adapting the test for other transparent orthographies, such as Italian or Finnish, and comparing performance across languages.
- Comorbidity Analysis: Integrating modules to screen for ADHD or dyscalculia in parallel, thereby improving classification specificity.
- Deployment Feasibility: Conducting pilot programs in diverse school districts, measuring real-world adoption and compliance.
- Adaptive Testing: Implementing dynamic item selection to adapt the difficulty based on real-time user performance, potentially minimizing testing time without sacrificing predictive power.
References
- International Dyslexia Association. Definition of Dyslexia. Baltimore, MD: IDA; 2002.
- Snowling MJ. Dyslexia. Journal of Child Psychology and Psychiatry. 2000;41(1):3-20.
- Caravolas M. The nature and causes of dyslexia in different languages. Child Development Perspectives. 2018;12(3):170–175.
- Ramus F, Szenkovits G. What phonological deficit? Quarterly Journal of Experimental Psychology. 2008;61(1):129–141.
- Ziegler JC, Goswami U. Reading acquisition, developmental dyslexia, and skilled reading across languages. Psychological Bulletin. 2005;131(1):3–29.
- Li G, Chang T. Machine learning for early detection in educational contexts. Computers & Education. 2021;160:104033.
- Esteva A, Kuprel B, Novoa R, Ko J, Swetter SM, Blau HM, et al. Dermatologist-level classification of skin cancer with deep neural networks. Nature. 2017;542(7639):115–118.
- Petretto DR, Masala C, Masala C. Dyslexia and Specific Learning Disorders: New International Diagnostic Criteria. Journal of Childhood Development Disorders. 2019;5(1):1–10.
- Rello L, Ballesteros M. Detecting dyslexia using eye tracking measures. Proc. W4A ’15: 12th Web For All Conference. 2015; Article 16.
- Rauschenberger M, Heuer S, Baeza-Yates R. How to design a web-based educational test. International Journal of Human–Computer Studies. 2020;145:102507.
- Scarborough H. Phonological core variable orthographic differences. Annals of Dyslexia. 1985;35(1):136–149.
- Babineau G, Kar P, Soman S. Machine learning for transparent languages. Applied Linguistics Research Journal. 2018;2(1):23–38.
- American Psychological Association. Publication Manual of the APA (7th ed.). Washington, DC: APA; 2020.
- Stone CA, Silliman ER, Ehren BJ, Apel K. Handbook of Language and Literacy. New York: The Guilford Press; 2016.
- Share DL. Phonological recoding and self-teaching. Developmental Review. 1995;15(4):449–506.
- Wimmer H, Schurz M. Dyslexia in regular orthographies: A brief update on current research. Dyslexia. 2010;16(4):296–301.
- Visser J. Developmental dyslexia: A research-based morphological approach. Reading and Writing. 2018;31(9):2113–2129.
- Landerl K, Wimmer H. Development of word reading fluency and spelling in a consistent orthography. Reading and Writing. 2008;21(5):505–527.
- Griffiths Y, Stuart M. Reviewing evidence-based practice in children’s literacy. Educational Psychology in Practice. 2013;29(1):1–20.
- Ferrer E, Shaywitz BA, Holahan JM, Marchione K, Michaels R, et al. Achievement gap in reading is persistent. Journal of Learning Disabilities. 2015;48(4):363–378.
- Nicolson RI, Fawcett AJ, Dean P. Dyslexia, development and the cerebellum. Trends in Neurosciences. 2001;24(9):515–516.
- Seymour PHK. Early reading development in European orthographies. British Journal of Psychology. 2003;94(2):143–174.
- Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. New York: Springer; 2009.
- Rello L, Bigham JP. Good Background Colors for Readers: A Study of People with and without Dyslexia. Proc. ASSETS ’17. 2017; Article 72.
- Perea M, Panadero V. The use of eyetracking in diagnosing dyslexia. Reading Research Quarterly. 2014;49(1):3–14.
- Gregor M, Dickinson A. Cognitive approaches to text-based interactions for dyslexic readers. Universal Access in the Information Society. 2007;6(4):353–366.
- Kastrin A, Rando Q, Rapee G. Keystroke dynamics for detection of reading disorders. Expert Systems with Applications. 2020;142:113019.
- Vellutino FR, Fletcher JM, Snowling MJ, Scanlon DM. Specific reading disability. Journal of Child Psychology and Psychiatry. 2004;45(1):2–40.
- Rello L, Baeza-Yates R, Ali A, et al. Dyslexia in Spanish: Data from a Large-Scale Online Test. PLoS ONE. 2020;15(12):e0241687.
- Onishi-Kuri M, Albero G, Ramírez G. Enhancing random forest classification. Applied Soft Computing. 2021;111:107647.
- Buda M, Maki A, Mazurowski M. A systematic study of the class imbalance problem. Neural Networks. 2018;106:249–259.
- Carnegie Mellon University Institutional Review Board. Approved Protocol #CMU-XXXX. Pittsburgh, PA: CMU IRB; 2019.
- White RM, Kelly F. Ethical issues in child research. Psychology Research. 2020;122(4):112–120.
- Livingstone S, Blum-Ross A. Families and screen time. Journal of Family Studies. 2021;27(1):146–161.
- Fuchs LS, Fuchs D, Compton DL. Dyslexia and the inadequate response to intervention model. British Journal of Educational Psychology. 2012;82(2):1–11.
- Abadiano H, Turner J. Analyzing Spanish reading errors. Reading Horizons. 2003;43(4):239–250.
- W3C. HTML5 Recommendation. World Wide Web Consortium. 2014.
- O’Reilly UM, Veeramachaneni K. On the design of online test harnesses. ACM eLearn Magazine. 2015;2(3):8–15.
- Powers DMW. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness & correlation. Journal of Machine Learning Technologies. 2011;2(1):37–63.
- Friedman J, Hastie T, Tibshirani R. Additive logistic regression. Annals of Statistics. 2000;28(2):337–374.
- Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York: Chapman & Hall; 1993.
- Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proc. KDD ’16. 2016;13(2):785–794.
- Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research. 2002;16(1):321–357.
- Zhang Y, Ling CX. A strategy to apply machine learning to small datasets in computational biology. BMC Bioinformatics. 2018;19(1):51.
- Dorogush AV, Ershov V, Gulin A. CatBoost: Gradient boosting with categorical features support. arXiv preprint. 2018;arXiv:1810.11363.
- Ke G, Meng Q, Finley T, Wang T, Chen W, et al. LightGBM: A highly efficient gradient boosting decision tree. Proc. NIPS ’17. 2017;3146–3154.
- Breiman L. Random forests. Machine Learning. 2001;45(1):5–32.
- Friedman JH. Greedy function approximation: A gradient boosting machine. Annals of Statistics. 2001;29(5):1189–1232.
- Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. 1996;58(1):267–288.
- Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995;20(3):273–297.
- Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature. 1986;323(6088):533–536.
- Pedregosa F, Varoquaux G, Gramfort A, Michel V, et al. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research. 2011;12:2825–2830.
- Kuncheva LI. Combining pattern classifiers: Methods and algorithms. John Wiley & Sons. 2014.
- Peres A, Nir M, Melnick T. Medical screening tests. Best Practice & Research Clinical Obstetrics and Gynaecology. 2018;47:75–85.
- Fawcett T. An introduction to ROC analysis. Pattern Recognition Letters. 2006;27(8):861–874.
- Hanley JA, McNeil BJ. The meaning and use of the area under a ROC curve. Radiology. 1982;143(1):29–36.
- He H, Garcia EA. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering. 2009;21(9):1263–1284.
- Ehri LC, Nunes SR, Willows DM, Schuster BV, Yaghoub-Zadeh Z, Shanahan T. Phonemic awareness instruction. Reading Research Quarterly. 2001;36(3):250–287.
- Rello L, Baeza-Yates R, Ali A, Llisterri J, et al. Online web-based detection. PLOS ONE. 2020;15(12):e0241687.
- Hernández-Valle I, Mateos A, González-Salinas C. Dyslexia detection in Spanish with morphological analysis. Applied Linguistics. 2019;40(4):1–25.
- L’Allier SK, Elish-Piper L. Early literacy interventions. The Reading Teacher. 2019;72(4):421–428.
- Snow CE, Burns MS, Griffin P. Preventing reading difficulties in young children. Washington, DC: National Academy Press; 1998.
- Shaywitz SE. Overcoming dyslexia. Knopf. 2020.
- Lyon GR, Shaywitz SE, Shaywitz BA. A definition of dyslexia. Annals of Dyslexia. 2003;53(1):1–14.
- Landerl K, Moll K. Comorbidity of learning disorders. Zeitschrift für Kinder- und Jugendpsychiatrie und Psychotherapie. 2010;38(3):145–152.
- Cardoso-Martins C, Pinheiro AMV. The reading skills of children with dyslexia. Reading and Writing: An Interdisciplinary Journal. 2021;34(3):509–531.
- Rello L, Baeza-Yates R. Dyslexia for Spanish. CHI Extended Abstracts. 2013;989–994.
- Martín-González S, Cuevas-Nunez T. Age of detection in reading difficulties. European Journal of Special Needs Education. 2019;34(3):297–307.
- Molfese V, Beswick J, Molnar A, et al. Developmental outcomes of reading interventions. Developmental Psychology. 2022;58(2):109–124.
- Wood CL, Connelly V. Reading difficulties in resource-limited contexts. International Journal of Educational Research. 2019;97:147–158.
- Pupillo L. Digital skills in education. OECD Education Working Papers. 2018;198:1–46.
- Malone TW. Toward a theory of intrinsically motivating instruction. Cognitive Science. 1981;5(4):333–369.





Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).