1. Introduction and Related Works
Dyslexia is unrelated to intelligence, and those affected can have average or above-average intellectual abilities. It is a lifelong condition that cannot be eradicated but can be effectively managed through appropriate interventions and support. It is estimated that dyslexia impacts approximately 5-10% of the population, and early detection and intervention can greatly enhance an individual's academic performance and overall quality of life.
Dyslexic individuals encounter difficulties in reading, writing, word formation, letter naming, numerical understanding, and color recognition. Consequently, dyslexia is a neurological condition that manifests as challenges in reading, writing, and spelling. The Yale Center for Dyslexia and Creativity consolidates research on dyslexia [1], shedding light on its nature and effects on individuals. Numerous publications delve into defining the scientific aspects of dyslexia, from diagnosis to treatment, categorizing them as follows:
"Overcoming Dyslexia" [1] provides a detailed exploration of dyslexia's causes, diagnosis methods, and practical strategies for parents, educators, and individuals with dyslexia.
"Dyslexia: A Practitioner's Handbook" [2] offers an extensive overview of dyslexia incorporating contemporary research findings along with assessment techniques and treatment approaches.
"Understanding Dyslexia and Other Learning Disabilities" [3] presents insights into dyslexia alongside other learning disorders concerning their origins, identification procedures, diagnostic criteria, and available treatments.
"The Dyslexic Advantage" [4] challenges conventional views by highlighting the strengths and benefits possessed by individuals with dyslexia rather than focusing solely on deficits.
"Multisensory Teaching of Basic Language Skills" outlines effective multisensory teaching methods tailored for individuals with dyslexia or other language-based learning challenges.
Dyslexia stands out as one of the most impactful learning disorders affecting an estimated 5-12% of the population [5-7]. This condition significantly hampers educational achievements as well as the social-emotional well-being of those affected.
Although people with dyslexia have normal vision and intelligence levels comparable to those of non-dyslexic individuals, they commonly show indicators such as difficulty in pronouncing unfamiliar words due to auditory issues. Eye movements during reading that differ notably from those of typical readers have also been observed in individuals with dyslexia [5]. Research continues to explore these differences associated with eye movements to better understand the underlying difficulties posed by this condition. Deciphering and identifying printed words during reading is a crucial aspect [5, 6]. Utilizing speech as a means to stimulate brain functions and enhance the quality of life for individuals with disabilities has been explored in laboratory experiments focusing on speech imagery for practical applications.
The integration of deep learning through electroencephalography (EEG) has shown promising results in speech assessment compared to conventional machine learning techniques [8]. The realm of artificial intelligence (AI) is increasingly being explored in healthcare and medicine, particularly concerning youth and children [12, 13].
Research has examined various dyslexia groups. One study evaluated 89 children aged 8-12 using three machine learning techniques, with the children categorized as normal, neutral, or dyslexic. A study involving 2679 children aged 7-9 revealed a strong correlation between reading speed and accuracy. Moreover, an experimental approach incorporating 32 linguistic reading exercises for Spanish speakers included 3644 participants aged 7-17 [11].
Further investigations on eye tracking technology employing Random Forest (RF) and Support Vector Machine (SVM) classifiers have highlighted SVM as the preferred choice due to its higher accuracy compared to the Random Forest model [12, 13]. Early detection models for dyslexia aim to identify symptoms such as delayed speech development, challenges in acquiring new vocabulary, difficulty in word formation, or struggles with recalling letters, numbers, or colors before formal schooling begins [14].
Early diagnosis plays a pivotal role in offering timely assessments and interventions for children with dyslexia, potentially leading to improved outcomes later in life. Studies by Al-Barhamtoshy and Motaweh underscore the significance of writing activities as indicative of dyslexia. Observations suggest that writing and typing tasks exhibit high sensitivity towards detecting dyslexia-related patterns in the left hemisphere of the brain compared to the right hemisphere. A study emphasizing Theta power spectrum alterations in the left frontal region of these children's brains further underlines its association with dyslexia indicators. Notably, an increase in Theta activity correlates with heightened mental workload [15]. A retrospective research endeavor involving EEG analysis during the COVID-19 pandemic focused on understanding EEG signal characteristics among 36 patients [16]. Our study utilizes machine learning and deep learning techniques to detect attention deficit, reading, and writing challenges through standard activities.
The subsequent sections will delve into the dataset we developed and the machine/deep learning models employed. Section 2 will detail the research methodology, while Section 3 will present our framework for analyzing and diagnosing dyslexia in comparison to others, along with the corresponding computed results. Section 4 will provide additional discussion and comparative assessments of related studies. Lastly, a summary and concluding remarks will be outlined in Section 5.
2. Research Methodology
In terms of research methodology, a systematic approach was adopted to identify relevant dyslexia issues and appropriate dataset domains. This involved defining digitized datasets and key deep-learning topics. Following the collection of digitized data, we studied and analyzed the most pertinent dyslexia indicators before implementing them in our work. Notably, EEG was used to interpret input signals from individuals with learning disabilities alongside conventional learners.
Data acquisition was facilitated through Massive Open Online Courses (MOOC) video clips to enable flexible learning opportunities. Subsequently, all data underwent comprehensive analysis involving processing, labeling, feature extraction, classification, and detection. The EEG data covered five bands of dyslexia indicators, with a preprocessing model employed to eliminate signal artifacts such as spikes or movements.
The methodological design encompassed sequential stages starting from data collection through to processing. Initial preprocessing steps focused on noise removal from EEG signals to ensure signal quality. Subsequently, cleaned EEG data underwent feature extraction utilizing Mel Frequency Cepstral Coefficients (MFCC), elliptic filters, and pass filters across five frequency bands. A central inquiry driving this study is the relationship between confusion and dyslexia since confusion plays a critical role in dyslexia detection. Individuals with dyslexia often experience confusion when reading or spelling due to their condition's nature; thus, an individual's threshold for confusion significantly impacts their frequency of errors.
The dataset utilized in this study was sourced from https://www.kaggle.com/wanghaohan/confused-eeg, containing snippets from online courses offered by edX [17]. Brain activity recordings were obtained from ten university students who watched these courses.
To conclude this research effectively, it is essential to address key questions regarding the primary indicators utilized in predicting dyslexia bands among individuals. The effectiveness of utilizing deep learning in diagnosing dyslexia and the study of attention and mediation data in this context is explored. The paper investigates key features that differentiate dyslexia from neurotypical individuals. Additionally, eye tracking during reading is employed to identify children at risk of dyslexia and long-term reading difficulties. By capturing real-time eye movements during reading, accurate assessments of the reading process can be obtained without requiring unrelated responses from the subject. This method offers a continuous record of reading efficiency and precision, distinguishing it from current screening techniques.
3. Model Architecture
The model architecture outlined in Figure 1 depicts the sequential stages of the proposed dyslexia classification framework. Beginning with data collection, the framework progresses through preprocessing, analysis, classification, and evaluation phases. Preprocessing involves utilizing smooth and band-pass filters across five frequency bands: Delta (0.5-4 Hz), Theta (4-8 Hz), Alpha (8-14 Hz), Beta (14-30 Hz), and Gamma (30-42 Hz). Mel Frequency Cepstral Coefficients (MFCCs) are applied to process EEG signals received from brain activity, leading to feature extraction for each band with corresponding labels denoting Dyslexia, Attention Deficit, or Autism.
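The band separation described above can be illustrated with a minimal sketch. It is not the filter design used in this work (smooth, elliptic, and band-pass filters are mentioned but not specified); it simply masks FFT bins as a stand-in for band-pass filtering. The band edges come from the text, while the function name `split_into_bands`, the 128 Hz sampling rate, and the synthetic test signal are assumptions made for illustration.

```python
import numpy as np

# Band edges in Hz, taken from the list in the text above.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 14.0),
    "beta": (14.0, 30.0),
    "gamma": (30.0, 42.0),
}

def split_into_bands(signal, fs):
    """Return one band-limited copy of `signal` per band (FFT masking)."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    bands = {}
    for name, (low, high) in BANDS.items():
        # Keep only the bins inside [low, high); zero everything else.
        mask = (freqs >= low) & (freqs < high)
        bands[name] = np.fft.irfft(spectrum * mask, n=signal.size)
    return bands

# Synthetic example: a 6 Hz (theta) component plus a weaker 20 Hz (beta) one.
fs = 128  # hypothetical sampling rate in Hz
t = np.arange(0, 4, 1.0 / fs)
theta_wave = np.sin(2 * np.pi * 6.0 * t)
beta_wave = 0.5 * np.sin(2 * np.pi * 20.0 * t)
bands = split_into_bands(theta_wave + beta_wave, fs)
```

On real recordings a proper filter (with a defined transition band and phase response) would normally replace the hard spectral mask, which can introduce ringing at segment edges.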
The algorithm employs training and evaluation via cross-validation to establish the target model for prediction. This iterative process involves training the model and analyzing output results across multiple validation cycles defined by iteration values. Hyperparameters are utilized for dyslexic prediction comparison, with metrics such as F1 measure, precision, recall, and overall accuracy assessed using a confusion matrix. Considering potential data imbalance issues affecting accuracy interpretation, Algorithm 1 details the training and cross-validation procedures implemented in this study.
Algorithm 1 Training and cross-validation evaluation
Begin
    For i = 1, 2, ... iteration-cycles Do
        For each suitable hyperparameter setting Do
            Predict suitable cross-validation folds // five-fold cross-validation is the default
            For each cross-validation fold Do
                Select suitable classifier
                Fit and Compile the model with data
                Store results with the configuration
                Prepare, Compile, and Store Confusion Matrix
            End
        End
    End
    Store experimental results according to the precision, recall, and F-measure
End
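The inner loop of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative sketch only: the classifier is a toy single-feature threshold rule rather than the deep model used in this work, the folds are built by simple interleaving (no shuffling or stratification), and all function names and the synthetic data are assumptions.

```python
def k_fold_indices(n_samples, k=5):
    """Split sample indices into k interleaved folds (five folds by default)."""
    indices = list(range(n_samples))
    return [indices[i::k] for i in range(k)]

def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision_recall_f1(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

def fit_threshold(xs, ys):
    """Toy classifier: midpoint between the two class means of one feature."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def cross_validate(xs, ys, k=5):
    """Train on k-1 folds, evaluate on the held-out fold, store the metrics."""
    results = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        threshold = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        y_true = [ys[i] for i in fold]
        y_pred = [1 if xs[i] > threshold else 0 for i in fold]
        tp, fp, fn, _tn = confusion_matrix(y_true, y_pred)
        results.append(precision_recall_f1(tp, fp, fn))
    return results

# Synthetic, well-separated data: class 0 near 0-2, class 1 near 5-7.
xs = [0.1 * i for i in range(20)] + [5 + 0.1 * i for i in range(20)]
ys = [0] * 20 + [1] * 20
scores = cross_validate(xs, ys)
```

Reporting precision, recall, and F1 per fold, rather than accuracy alone, is what makes the evaluation informative when the classes are imbalanced.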
Early detection of dyslexia is a prominent subject in the upcoming discussions involving experts from neuro and computational sciences. Machine learning models have shown promise in distinguishing dyslexic readers and individuals with reading difficulties through eye movement analysis. To enhance this identification process, a hybrid model combining Random Forest (RF) as a crucial feature with a Support Vector Machine (SVM) classifier has been proposed. This study delves into dyslexia, characterized by challenges in reading and understanding written content.
Our research focuses on analyzing brain signals using Electroencephalography (EEG) channels located in the left hemisphere (FP1, F3, F7, C3, T7, P3, P7, O1) and the right hemisphere (FP2, F4, F8, C4, T8, P4, P8, O2). The brain signals are divided into different bands using Mel Frequency Cepstral Coefficients (MFCCs). These bands include the Delta-band (Bd), Theta-band (Bt), Alpha-band (Ba), Beta-band (Bb), and Gamma-band (Bg), ordered from low to high frequencies. Thresholds for the Theta, Beta/Theta, and Gamma bands are then applied to identify dyslexic conditions within each band bi generated by the MFCC analysis.
After all bands derived from the MFCC analysis have been processed against the pre-defined thresholds for the dyslexia criteria, individuals are categorized into either the dyslexic group di or the normal group nj. The algorithm for Band Identification and Classification yields sets of dyslexic and normal participants. The experimental dataset utilized is deemed relatively clean, with threshold values adjusted according to noise levels within the dataset.
Algorithm 2: Bands Identification and Classification
Begin
    Result: D, N the sets of dyslexic and normal people
    D ← {}
    N ← {}
    While has next band Do
        p ← next people band
        b1, b2, … b5 ← MFCC(p)
        For bi ∈ {b1, b2, … b5} Do
            If bi satisfies dyslexic-condition Then
                d ← argmax_di {|di ∩ bi|}
            If |d ∩ bi| > Threshold Then
                D ← D ∪ {di}
            Else // does not satisfy the dyslexic condition (i.e. normal)
                N ← N ∪ {bi}
            Endif
        Endfor
    End While
    Return D, N
End
3.2. Dyslexia Prediction Model
A key shared symptom among the five identified bands related to dyslexia is eye tracking issues, such as scattering, attention problems, and confusion during various activities. The proposed approach involves gathering eye-tracking data from both dyslexic and non-dyslexic learners during teaching sessions, followed by the application of machine learning or deep learning algorithms to differentiate between individuals with dyslexia and those without. Attention plays a crucial role in addressing challenges associated with prolonged tasks, particularly in instances where individuals are required to wait before completing a specific assignment. Variances in brain wave signals have been observed between individuals with attention deficit hyperactivity disorder and those without [19], with studies focusing on evaluating the Theta Beta ratio in EEG brain signals [21-24]. Research indicates that this ratio tends to decrease during experimental observations. In analyzing each participant's EEG signal, the following parameters are taken into account:
The duration of the brain signal recorded during the teaching session for each participant.
The time interval of the received EEG signal (measured in seconds) is set at 120 seconds.
Manual tagging and categorization of each participant's record into one of three groups: dyslexic, attention deficit, or autism.
Examination of an attention deficit parameter within the signal interval for each participant.
3.2.1. Delta Waves
Delta waves represent brain waves that can vary significantly in speed, depending on their frequency measured in hertz (Hz). These slowest brain waves are commonly found in infants and young children and are associated with deep relaxation and rejuvenating sleep. Their presence is often noted in cases of brain traumas, learning difficulties, mental blocks, and severe ADHD symptoms. Inadequate production of delta waves leads to compromised sleep quality and an inability to properly restore both the body and mind. Adequate generation of delta waves supports immune function, natural healing processes, restorative sleep cycles, and a sense of complete rejuvenation.
3.2.2. Theta Waves
Theta waves fall within a frequency range of 4 to 8 Hz and occur when an individual is in a dream-like state or experiences transitions between sleep and wakefulness states. Referred to as the twilight state by some sources [5, 9, 10, 24], theta waves reflect a particular level of consciousness characterized by these oscillations between different states. Theta waves are generally associated with a lack of mental alertness, indicating that the individual is either overly relaxed or unfocused at that specific moment.
3.2.3. Alpha Waves
Alpha waves fall between theta and beta waves on the brain wave spectrum. These waves are typically present in healthy, awake adults when they are resting with their eyes closed, with frequencies ranging from 8 to 13 Hz. Neurologists often measure alpha rhythms at the back of the head while the patient's eyes are shut.
3.2.4. Beta Waves
Beta waves, with frequencies ranging from 14 to 30 Hz, suggest that an individual is concentrating on a specific task. They reflect alertness and focused mental activity. Hans Berger linked beta waves to attentive focus and identified three frequency ranges: beta 1 (14–20 Hz), beta 2 (21–30 Hz), and gamma (30–60 Hz). Various potential functions of beta rhythms have been proposed, including synchronizing cortical representations, inhibiting movement and motor planning, maintaining the status quo, signaling decision-making, and focusing action-selection network functions.
3.2.5. Gamma Waves
The fastest brain waves are gamma waves, with frequencies ranging from 30-42 Hz. These highly precise gamma waves play a key role in assessing people's perception and consciousness during conscious perception processes. They are particularly relevant for memory processes, attention, working memory, and long-term memory, as well as conditions such as schizophrenia, hallucinations, Alzheimer's disease, and epilepsy.
In refining dyslexic diagnoses following initial classification results that may lead to dual outcomes (dyslexic or non-dyslexic autistic), a refinement phase is crucial. The process involves analyzing bands b1 through b5 to identify band bi based on satisfying threshold conditions for Theta and Beta/Theta or Gamma. Classification accuracy is achieved by calculating averages for Theta band values, Beta/Theta ratios, and Gamma band values to derive an overall average parameter (a). In our study, it was determined that a value of a = 1.5 yielded the most optimal outcomes, as not all threshold criteria exceeded a demonstrated significance. Further computations are necessary to explore additional sub-correlations, such as Beta1 and Beta2, as well as Gamma1 and Gamma2. While not definitive in comparison to the threshold, this could potentially serve as an additional parameter for analysis [5, 9, 10, 24].
3.3. Frequencies Bands of Dyslexia and Diagnosis Refinement
Regarding Frequency Bands for Dyslexia, there is evidence indicating that individuals with dyslexia may display variations in neural activity within specific brain frequency bands. Various studies have identified differences associated with dyslexia across different frequency bands, summarized in Table 1.
Research has indicated that individuals with dyslexia might exhibit reduced beta activity waves in regions linked to language processing and phonological awareness within the left hemisphere of the brain. Similarly, decreased gamma activity has been observed in areas involved in language processing and phonological awareness among those with dyslexia. Additionally, alterations in alpha activity within parietal brain regions responsible for attention and spatial processing have been noted among individuals with dyslexia. Furthermore, an increase in delta activity within temporal brain regions related to auditory information processing has also been reported.
It is essential to acknowledge that these findings are not universally consistent across all studies and can vary depending on specific tasks and conditions utilized in prior research efforts. Further investigations are required to enhance our understanding of the neural mechanisms underlying dyslexia and their implications for reading and language challenges.
Unfortunately, due to differences in research methodologies and focal points regarding dyslexia-related neural correlates in frequency bands, a direct comparison of these studies is unfeasible. The indicators presented in Table 1 outline the connections between frequencies, bands, activities, brain regions involved, and markers for dyslexia.
As an option, we can offer general information about the research techniques applied to investigate the neural aspects of dyslexia. These methods may encompass functional magnetic resonance imaging (fMRI), magnetoencephalography (MEG), and electroencephalography (EEG). Through these approaches, researchers can observe brain activity in individuals with dyslexia as they engage in language-related tasks like reading or phonological awareness exercises. By scrutinizing patterns of neural activity, researchers can uncover insights into the fundamental mechanisms of dyslexia and how they contrast with typical reading abilities. It is essential to acknowledge that while these methods have their constraints, further investigation is necessary for a more comprehensive understanding of the neural correlates of dyslexia.
Moreover, EEG has the potential to disclose information about the overall brain condition associated with attention. Alpha (8–14 Hz) and Theta (4-8 Hz) frequency ranges have been correlated with activities related to attention. Variations in attention or cognitive involvement may be reflected by alterations in the intensity or synchronization of these frequency bands.
It is noteworthy that not all individuals with dyslexia manifest these characteristics, highlighting the need for additional research to fully grasp the neurological mechanisms at play in dyslexia. Furthermore, ongoing work in this area suggests that future breakthroughs could emerge. The amplitude of Theta and Gamma band signals from both hemispheres during language tasks and periods of rest are illustrated for males and females in Figure 2.
Despite being a relatively recent area of exploration, there is mounting evidence supporting the applicability of measuring meditation processes using EEG. Given that meditation involves cognitive and emotional processes, EEG can capture neural manifestations of these activities. For example, researchers have utilized EEG to explore cognitive control strategies utilized in meditation such as reaction suppression and conflict monitoring. Event-related potentials linked to conflict monitoring and error detection, like error-related negativity (ERN), can be identified through EEG recordings.
The EEG can provide insights into the psychological aspects of meditation by measuring brain activity related to emotional processing and regulation, such as the late positive potential (LPP) or frontal alpha asymmetry (FAA). Proper experimental design, data analysis methods, and integration with other assessments are crucial for accurate quantification of meditation and attention using EEG. This tool enables a detailed study of the neurological connections to meditation and attention by offering precise assessments of brain activity.
3.4. The Cost Criteria
Analyzing the likelihood of dyslexia helps classify individuals into class one (dyslexic) or class zero (normal) using traditional logistic regression models. The log function is employed in computational calculations, with cost criteria represented by an equation involving summation over all data points. Achieving accuracy in deep learning involves handling balanced or unbalanced datasets, with custom callbacks like ROC or AUC scores used to address dataset imbalances.
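The cost criterion mentioned above is not written out in the text. A common choice for logistic regression with classes one and zero is the averaged log loss (binary cross-entropy), sketched below under that assumption; the function names and example probabilities are illustrative only.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(y_true, y_prob, eps=1e-12):
    """Averaged logistic cost over all data points:
    J = -(1/m) * sum(y * log(p) + (1 - y) * log(1 - p)).
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

# Confident, correct predictions cost less than uninformative ones.
confident = log_loss([1, 0], [0.9, 0.1])
uncertain = log_loss([1, 0], [0.5, 0.5])
```

An uninformative prediction of 0.5 for every sample gives a cost of log 2, which is a useful baseline when reading training curves.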
In the realm of neural networks' architecture, modern predictors can surpass human experts in detection accuracy through machine learning techniques within AI frameworks. Fine-tuning neural network architectures using Keras and TensorFlow libraries presents a challenge that may require multiple training iterations to optimize model performance. Determining the input layer dimension for neural networks relies on dataset characteristics; therefore, a function is needed to handle this issue efficiently. The optimization of network architecture involves deciding on the number of hidden layers and neurons for optimal performance. For these reasons, individuals may seek to explore their methods, conduct experiments, or adopt established architectural standards.
3.5. EEG Data Additional Processing
This section presents a typical additional function illustration. It outlines eight steps for developing an EEG band processing roadmap. Step 8 focuses on extracting and characterizing cortical alpha rhythm features for the sequence:
1. Load EEG data from the stream file and associate it with a list of channel names and corresponding bands' frequencies in Hertz.
2. Filter and process the data into a pass-band ranging from 1 to 42 Hz based on standard bands' frequencies.
3. Transform the data into a time series showing amplitudes and frequency oscillations within Alpha frequency bands using Theta, Delta, and Alpha bands.
4. Group the processed data into four microstates (A, B, C, and D) sequences using methods like K-means clustering.
5. Analyze the frequencies of the four micro-states to assign labels to each sequence.
- Calculate the distribution of micro-state labels $P(X_t = S_i)$
- Identify label $S_i \in \{A, B, C, D\}$
- Compute empirical peaks per second (PPS)
6. Evaluate Shannon Entropy: $H = -\sum_{i} P(X_t = S_i)\,\log P(X_t = S_i)$
7. Determine Markov process sequences for cortical Alpha rhythm (alpha = 0.01)
P0= test-Markov0 (x, n_maps, alpha)
P1= test-Markov1 (x, n_maps, alpha)
P2= test-Markov2 (x,n_maps,alpha)
8. Calculate the transition matrix over time, based on a specified time window length (L). Segment the dataset into r non-overlapping blocks and calculate a transition matrix for each block.
P3 = Conditional-Homogeneity-Test (x, n_maps, L, alpha)
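Steps 5, 6, and 8 above can be illustrated on a short label sequence. The sketch below is not the implementation behind the `test-Markov` and `Conditional-Homogeneity-Test` calls listed in the steps; it only shows how the label distribution, Shannon entropy (natural logarithm assumed), and a first-order transition matrix are obtained from a sequence of micro-state labels. Function names and the demo sequence are hypothetical.

```python
import math
from collections import Counter

STATES = "ABCD"  # the four micro-state labels

def label_distribution(sequence):
    """Empirical probability of each micro-state label."""
    counts = Counter(sequence)
    n = len(sequence)
    return {s: counts.get(s, 0) / n for s in STATES}

def shannon_entropy(sequence):
    """H = -sum_i P(S_i) * log P(S_i), skipping labels that never occur."""
    dist = label_distribution(sequence)
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

def transition_matrix(sequence):
    """Row-normalized counts of transitions from one label to the next."""
    counts = {a: {b: 0 for b in STATES} for a in STATES}
    for a, b in zip(sequence, sequence[1:]):
        counts[a][b] += 1
    matrix = {}
    for a in STATES:
        row_total = sum(counts[a].values())
        matrix[a] = {
            b: (counts[a][b] / row_total if row_total else 0.0) for b in STATES
        }
    return matrix

demo = "ABCDABCDABCD"  # hypothetical label sequence
h = shannon_entropy(demo)
t = transition_matrix(demo)
```

For the block-wise analysis in step 8, the same `transition_matrix` function could be applied to each non-overlapping block of length L and the resulting matrices compared.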
3.6. Hidden Layers Architecture
The complexity of deep learning costs often arises from balancing too many features versus insufficient parameters. Consequently, experimental testing is crucial for understanding parameter behaviors accurately when designing hidden layer architectures. This necessitates techniques such as cross-validation network optimization or hyperparameter optimization. Various data processing procedures and analysis methods are employed in deriving EEG measurements related to attention and mediation. Common approaches in EEG research include:
Event-Related Potentials (ERPs): ERPs reflect attentional and cognitive activities through time-locked EEG responses to specific events or stimuli. Epochs surrounding each event are averaged to enhance the signal-to-noise ratio for ERP assessments. The core computation in ERP analysis is depicted by the following formula:
$$\mathrm{ERP}(t) = \frac{1}{N}\sum_{i=1}^{N} \mathrm{EEG}_i(t) - \mathrm{Baseline}$$
In this context, ERP(t) refers to the event-related potential at a specific time t, and EEG_i(t) is the signal of the i-th of N epochs. The baseline signal, typically taken from a pre-stimulus period, is subtracted from the average signal at the time of interest.
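The averaging and baseline subtraction described above can be sketched with NumPy. The function name `erp`, the choice of the first samples of each epoch as the pre-stimulus window, and the synthetic epochs are assumptions for illustration; the baseline is removed per epoch before averaging, which gives the same result as subtracting the mean baseline from the averaged signal.

```python
import numpy as np

def erp(epochs, baseline_samples):
    """Average baseline-corrected epochs.

    epochs: array of shape (n_epochs, n_samples), time-locked to the event.
    baseline_samples: number of leading samples treated as pre-stimulus.
    """
    epochs = np.asarray(epochs, dtype=float)
    # Mean of the pre-stimulus window, one value per epoch.
    baseline = epochs[:, :baseline_samples].mean(axis=1, keepdims=True)
    return (epochs - baseline).mean(axis=0)

# Synthetic example: the same response riding on a different offset per epoch.
template = np.zeros(100)
template[60:80] = 1.0
offsets = np.array([1.0, 2.0, 3.0])
epochs = template + offsets[:, None]
erp_wave = erp(epochs, baseline_samples=50)
```

In this noise-free example the baseline correction removes the per-epoch offsets entirely, so the averaged waveform equals the underlying response.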
Power spectral analysis examines how power in the EEG signal is distributed across different frequency bands, offering insights into attention-related neural processes. The equation for power spectral analysis is:
$$\mathrm{PSD}(f) = \left|\mathrm{FFT}\big(\mathrm{EEG}(t)\big)\right|^{2}$$
Here, PSD(f) represents the power spectral density at frequency f. It is calculated as the squared absolute value of the FFT of the EEG signal.
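A minimal periodogram along these lines is sketched below. The normalization by sampling rate and signal length, and the doubling of bins to obtain a one-sided spectrum, are conventional choices assumed here rather than details given in the text; the function name and the test signal are likewise illustrative.

```python
import numpy as np

def power_spectrum(signal, fs):
    """One-sided periodogram: squared FFT magnitude, scaled by fs * n."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    psd = np.abs(np.fft.rfft(signal)) ** 2 / (fs * n)
    # Double every bin that has a mirrored negative-frequency partner.
    if n % 2 == 0:
        psd[1:-1] *= 2  # DC and Nyquist bins are not mirrored
    else:
        psd[1:] *= 2    # only the DC bin is not mirrored
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd

# Synthetic example: a 10 Hz oscillation sampled at a hypothetical 128 Hz.
fs = 128
t = np.arange(0, 2, 1.0 / fs)
freqs, psd = power_spectrum(np.sin(2 * np.pi * 10.0 * t), fs)
peak_freq = float(freqs[np.argmax(psd)])
```

A single periodogram is a noisy estimate; averaging over several windows (Welch's method) is the usual remedy on real EEG.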
Connectivity analysis delves into functional connectivity or communication between diverse brain regions through metrics like coherence, phase synchronization, and imaginary coherence. Various connectivity analysis equations exist based on the chosen metric. For example, coherence between two EEG signals at frequency f can be computed using this formula:
$$\mathrm{Coherence}(f) = \frac{\left|\text{Cross-Spectral Density}(XY(f))\right|^{2}}{\text{Spectral Density}(X(f)) \cdot \text{Spectral Density}(Y(f))}$$
In this equation, Spectral Density(X(f)) and Spectral Density(Y(f)) represent the spectral densities of signals X and Y at frequency f respectively, while Cross-Spectral Density(XY(f)) denotes their cross-spectral density.
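Coherence only becomes informative when the spectral densities are averaged over several segments; with a single segment the ratio is identically one. The sketch below assumes non-overlapping Hann-windowed segments; the function name, segment length, sampling rate, and the synthetic signals are all illustrative assumptions.

```python
import numpy as np

def coherence(x, y, fs, segment_length):
    """Magnitude-squared coherence from segment-averaged spectra."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    window = np.hanning(segment_length)
    n_segments = min(x.size, y.size) // segment_length
    sxx = np.zeros(segment_length // 2 + 1)
    syy = np.zeros(segment_length // 2 + 1)
    sxy = np.zeros(segment_length // 2 + 1, dtype=complex)
    for k in range(n_segments):
        part = slice(k * segment_length, (k + 1) * segment_length)
        fx = np.fft.rfft(window * x[part])
        fy = np.fft.rfft(window * y[part])
        sxx += np.abs(fx) ** 2      # spectral density of X (unnormalized)
        syy += np.abs(fy) ** 2      # spectral density of Y (unnormalized)
        sxy += fx * np.conj(fy)     # cross-spectral density of X and Y
    freqs = np.fft.rfftfreq(segment_length, d=1.0 / fs)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy)

# Two noisy channels sharing a 10 Hz rhythm (with a phase lag).
fs = 128
t = np.arange(0, 16, 1.0 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10.0 * t) + 0.5 * rng.normal(size=t.size)
y = np.sin(2 * np.pi * 10.0 * t + 0.7) + 0.5 * rng.normal(size=t.size)
freqs, coh = coherence(x, y, fs, segment_length=256)
coh_at_10hz = float(coh[np.argmin(np.abs(freqs - 10.0))])
```

Because the normalization factors cancel in the ratio, the unnormalized sums can be used directly; values lie between zero and one, and a constant phase lag between channels does not reduce coherence.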
It is crucial to note that actual analysis methods may vary depending on research objectives, experimental design, and software tools utilized. Specialized software such as EEGLAB and FieldTrip are commonly employed by researchers for these calculations. These formulas provide simplified illustrations of key concepts within EEG analysis. The field comprises numerous advanced techniques beyond these foundational calculations, reflecting its intricate and multidimensional nature.
4. Evaluation and Experimental Results
Assuming no prior knowledge of the original data labeling, a model is created using deep neural networks to differentiate between dyslexic and non-dyslexic brain signals. The model evaluates input signals and provides diagnostic percentage values.
Figure 3 illustrates the proposed detailed framework to analyze the EEG signals during the inputting of the four activities (mental, reading, writing, and math calculation).
In a case study involving MOOC (massive open online course) video watching and mental activity, a dataset from 10 trainers was analyzed. The videos included teaching materials on various scientific subjects like algebra and geometry, with additional videos designed to induce confusion. Dyslexia indicators such as confusion, attention, spelling errors, reading mistakes, and disorientation were identified. Data sourced from Kaggle was used to study brain electrical activity for demographic information about patients. The objective is to develop a deep neural network capable of predicting dyslexic indicators based on brain signal features. A reference classifier in Figure 4 outlines dimensions influencing early dyslexia diagnosis based on behaviors observed during MOOC video watching. Measurement indicators for each behavior are specified within the framework.
The experiment employed six hidden layers due to complex correlations between dyslexia indicators and EEG signal features. The network architecture consisted of 512 neurons in the first layer gradually reducing to 16 neurons before the output decision using a binary classifier. The sigmoid activation function constrained network outputs between Zero and One.
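The layer layout described above can be sketched as follows. Halving the width at each step is an assumption, chosen because it takes exactly six layers to go from 512 to 16 neurons; the ReLU hidden activation, the random weight initialization, and the 11-feature input are likewise illustrative. The sketch uses NumPy for an untrained forward pass rather than the Keras/TensorFlow model of the study.

```python
import numpy as np

def hidden_layer_sizes(first=512, last=16):
    """Widths obtained by halving from `first` down to `last`."""
    sizes = [first]
    while sizes[-1] > last:
        sizes.append(sizes[-1] // 2)
    return sizes

def init_network(n_features, sizes, seed=0):
    """Random weights; the input dimension is read from the data."""
    rng = np.random.default_rng(seed)
    dims = [n_features] + sizes + [1]  # one sigmoid output unit
    return [
        (rng.normal(0.0, np.sqrt(2.0 / d_in), size=(d_in, d_out)), np.zeros(d_out))
        for d_in, d_out in zip(dims[:-1], dims[1:])
    ]

def predict_proba(x, layers):
    """Forward pass: ReLU hidden layers, sigmoid output in [0, 1]."""
    a = np.asarray(x, dtype=float)
    for w, b in layers[:-1]:
        a = np.maximum(a @ w + b, 0.0)
    w, b = layers[-1]
    return 1.0 / (1.0 + np.exp(-(a @ w + b)))

sizes = hidden_layer_sizes()
features = np.random.default_rng(1).normal(size=(4, 11))  # hypothetical inputs
layers = init_network(features.shape[1], sizes)
probs = predict_proba(features, layers)
```

Reading the input dimension from `features.shape[1]` means the same builder works unchanged when the number of extracted EEG features changes.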
4.1. Case 1: MOOC Watching
The dataset included EEG signals from 12812 participants during MOOC watching for training purposes.
Table 2 presents data from the first 20 participants, including learner ID, video ID, attention levels, mediation indicators, and different brainwave bands along with predefined labels indicating dyslexic or non-dyslexic status. Participant identifiers and labels were anonymized for privacy reasons using color codes.
According to Table 3, learners 1 and 10 have been identified as dyslexic students due to attention-related issues, contrasting with learners 2, 3, 4, 5, 6, 7, 8, and 9. Therefore, an examination of signal analysis, classification, and their interrelations will be conducted. An expert has diagnosed learner 5 with dyslexic behavior and observed that this learner exhibited higher Theta and Theta-Beta ratios compared to others.
In the second test group, which consists of the last 10 students in the dataset, three individuals exhibit signs of dyslexia. Specifically, students #4, #5, and #10 displayed Theta wave values of 148692, 192887, and 1184366, respectively, indicating potentially higher levels compared to their peers. Notably, student #10 demonstrates potential dyslexic traits based on their Theta and Theta/Beta 1 parameter values, suggesting heightened attention levels. Additionally, students #4 and #5 also show positive attention indicators. Conversely, students #1, #2, #3, and #6 exhibit positive mediation indications (as shown in Table 4).
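The proposed framework uses accuracy, precision, recall, and F1-Score as performance metrics. A threshold metric value is then calculated to enhance performance beyond these measures. The Receiver Operating Characteristic (ROC) method is employed to determine the optimal threshold value by maximizing the True Positive Rate (TPR) and minimizing the False Positive Rate (FPR):
$$\mathrm{TPR} = \frac{TP}{TP + FN} \quad \text{to be maximum, and}$$
$$\mathrm{FPR} = \frac{FP}{FP + TN} \quad \text{to be minimum,}$$
where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives.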
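One way to turn "maximize TPR, minimize FPR" into a single selection rule is to pick the threshold with the largest difference TPR - FPR (Youden's J statistic). The text does not state which rule was used, so the sketch below is an assumption, as are the function names and the example scores.

```python
def roc_points(y_true, scores):
    """Return (threshold, TPR, FPR) for every distinct score, high to low."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    points = []
    for thr in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((thr, tp / positives, fp / negatives))
    return points

def best_threshold(y_true, scores):
    """Pick the ROC point with the largest TPR - FPR."""
    return max(roc_points(y_true, scores), key=lambda p: p[1] - p[2])

# Hypothetical model outputs for three non-dyslexic (0) and three dyslexic (1) cases.
labels = [0, 0, 0, 1, 1, 1]
outputs = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
threshold, tpr, fpr = best_threshold(labels, outputs)
```

With these perfectly separable scores the selected threshold is 0.6, at which every positive case is detected and no negative case is flagged.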
The ROC curves and AUC scores are utilized to determine the most effective model for categorizing the suggested framework. The interpretation of ROC curves is illustrated in Figure 5. A custom callback within the Keras library is employed to compute the ROC curve and assess the AUC score while training the proposed model.
The calculation of the correlation coefficient by the computational model necessitates an examination of the calculation method to comprehend its significance, which is formulated as follows:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n\,\sigma_x\,\sigma_y}$$
The numerator in this formula represents the summation of the product of the difference between each value of the first parameter and its mean with that of the second parameter and its mean. The denominator comprises the product of each variable's standard deviation ($\sigma_x$, $\sigma_y$) multiplied by the number of data samples n.
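The correlation coefficient described above can be computed directly from its definition. The sketch below uses the population standard deviation (dividing by n), which matches a denominator of the form "number of samples times both standard deviations"; the function name and example values are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of paired deviations from the means.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: n times the two population standard deviations.
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return numerator / (n * std_x * std_y)

r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # perfectly increasing
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # perfectly decreasing
```

The function is undefined when either variable is constant, since its standard deviation is then zero.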
4.2. Case 2: Eye Tracking during Reading Activity
The second case study involved 89 participants in the control group and 88 participants in the dyslexic group. All individuals were instructed to read the same text paragraph, while their eye-tracking saccades and fixations were recorded for classification analysis. The classification process utilized K-means and SVM algorithms. The data collected included x-y coordinates from both the left and right eyes, leading to the observation of vectors with specific parameters:
- Lx: x coordinate of the left eye
- Ly: y coordinate of the left eye
- Rx: x coordinate of the right eye
- Ry: y coordinate of the right eye
In cases where discrepancies existed between data lengths from the left and right eyes, adjustments were made to equalize vector lengths before proceeding with the classification process as illustrated in
Figure 6.
Eye tracking, unlike traditional screening methods that rely on verbal responses, offers an objective way to evaluate the reading process in real time without requiring the subject to provide a spoken or written assessment. This study's dataset includes 97 high-risk individuals with early word decoding difficulties and 88 low-risk individuals selected from a larger group of 2,165 second-grade schoolchildren. By utilizing predictive modeling and statistical resampling techniques, classification models were developed using eye-tracking data of less than one minute, demonstrating high accuracy in distinguishing between high-risk and low-risk subjects. While dyslexia is primarily language-based, the findings indicate that eye movements during reading can be indicative of individual reading ability, making eye tracking a valuable tool for identifying children at risk for long-term reading challenges.
The study, conducted as part of the Kronoberg Reading Development Project, involved 185 subjects (Swedish school children between 1989 and 2010) from an initial group of 2,165 second-grade students. Among them, 103 individuals were identified in 1989 as having difficulty developing word reading skills at a normal rate. This high-risk group consisted of 82 males (7.7% of all males) and 21 females (1% of all females), meeting specific inclusion criteria such as being native Swedish speakers, scoring in the lower fifth percentile on standardized word decoding tests, and receiving confirmation of persistent reading difficulties from their teachers. Individuals with intellectual disabilities were excluded from the selection process.
The primary concern is the varying length of the recorded data. One approach is to select a fixed number of features (n), resulting in an n-dimensional dataset where each dimension signifies a feature. This requires a thorough grasp of the domain and several assumptions about feature selection; with incomplete domain knowledge, crucial aspects of the signal may be overlooked. When the raw signal is processed instead, equalizing lengths involves either padding or data interpolation. However, augmenting temporal data with additional values can alter the temporal and spectral properties of the signal. Applying a frequency transformation to the time series typically preserves the vital information in the dataset, but the resulting spectra still vary in length. Modifying signals in the frequency domain permits adjustments without affecting time characteristics.
To prevent signal distortion, a sliding window approach can be adopted, creating frequency groups or bins across the signal while maintaining shared fractions between neighboring bins. By employing an equal number of bins throughout the signal and retaining consistent proportions of shared frequencies between adjacent bins, specific time-domain characteristics can be captured systematically without arbitrary decisions on feature inclusion or exclusion, so that the uncompressed data in each bin is condensed into a sum.
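One possible reading of the binning scheme described above is sketched below: a spectrum of arbitrary length is covered by a fixed number of equally wide bins, neighboring bins share a fixed fraction of their width, and the values in each bin are summed. The function name, the 50% overlap, and the rounding of bin edges are assumptions made for illustration. The text does not specify them.

```python
def overlapping_bin_sums(spectrum, n_bins, overlap=0.5):
    """Condense a spectrum of any length into n_bins sums over overlapping bins."""
    if n_bins < 1 or not 0 <= overlap < 1:
        raise ValueError("need n_bins >= 1 and 0 <= overlap < 1")
    if len(spectrum) < n_bins:
        raise ValueError("spectrum is shorter than the number of bins")
    # Choose the bin width so that n_bins overlapping bins span the whole spectrum:
    # length = width + (n_bins - 1) * hop, with hop = width * (1 - overlap).
    width = len(spectrum) / (1 + (n_bins - 1) * (1 - overlap))
    hop = width * (1 - overlap)
    sums = []
    for k in range(n_bins):
        start = int(round(k * hop))
        stop = max(start + 1, int(round(k * hop + width)))
        sums.append(sum(spectrum[start:stop]))
    return sums


# Two hypothetical spectra of different lengths yield feature vectors of equal length.
short_features = overlapping_bin_sums([1.0] * 10, n_bins=4)
long_features = overlapping_bin_sums([1.0] * 20, n_bins=4)
print(short_features, long_features)
```

Because the number of bins is fixed, every participant ends up with a feature vector of the same dimension regardless of recording length.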
This study focuses on extracting saccade and fixation features using supervised learning and SVM classification, concentrating on the exact coordinate locations of the left and right eyes during reading tasks in which all participants read identical lines of text. Discrepancies between left and right eye movements are visualized through yellow and red lines representing the readings from each eye, as shown in
Figure 6. The yellow line shows the readings from the left eye and the red line those from the right eye; the orange line appears where the two lines are superimposed on each other, and some red and yellow segments do not intersect. Each participant is represented by a distinct vector, so vector lengths vary between participants. Consequently, the K-means method is applied for classification after length normalization, achieved either by adding zeros at the end of vectors or by averaging all vector values. This process improves the accuracy score from 69.35% to 72.11% and further to 73.12%, and allows a confusion matrix to be visualized. The confusion matrix indicates that shorter vectors are associated with normal candidates while longer ones correspond to dyslexic individuals.
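The length normalization step can be sketched as follows. The example pads every vector to the length of the longest one, either with zeros or with the vector's own mean. Treating "averaging all vector values" as mean-padding is an assumption made here, since the text does not define the operation precisely. The function names and sample coordinates are hypothetical.

```python
def pad_to_length(vector, target_len, fill="zero"):
    """Extend a vector to target_len with zeros or with its own mean value."""
    if fill not in ("zero", "mean"):
        raise ValueError("fill must be 'zero' or 'mean'")
    if not vector or len(vector) > target_len:
        raise ValueError("vector must be non-empty and no longer than target_len")
    fill_value = 0.0 if fill == "zero" else sum(vector) / len(vector)
    return list(vector) + [fill_value] * (target_len - len(vector))


def equalize(vectors, fill="zero"):
    """Pad all vectors to the length of the longest one."""
    target_len = max(len(v) for v in vectors)
    return [pad_to_length(v, target_len, fill) for v in vectors]


# Hypothetical per-participant coordinate vectors of unequal length.
recordings = [[1.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
zero_padded = equalize(recordings, fill="zero")
mean_padded = equalize(recordings, fill="mean")
print(zero_padded, mean_padded)
```

Either variant yields equal-length vectors that a clustering or classification algorithm can consume. Padding also encodes the recording length itself, which is consistent with the observation that vector length separates the two groups.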
To enhance model accuracy, a Short-Time Fourier Transform (STFT) is employed to determine the frequency and phase content of local sections of each participant's signal as it evolves over time. The long time signal is divided into equal shorter segments, and the STFT is computed for each segment using the formula:
$$S(\tau, \omega) = \int_{-\infty}^{\infty} s(t)\, w(t - \tau)\, e^{-i \omega t}\, dt$$
Here, w(τ) denotes the window function (Hann or Gaussian), s(t) represents the signal for each participant (dyslexic or normal), and S(τ, ω) stands for the Fourier transform of s(t) w(t-τ), reflecting the phase and magnitude of the participant's signal over time after computing the average power spectra. τ and ω represent the time and frequency axes respectively.
Figure 7 portrays a classification accuracy rate of 96.24% after dividing the time signal into four equal segments.
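A discrete counterpart of the segment-wise transform can be sketched with the standard library alone. The example splits a signal into four equal, non-overlapping segments, applies a Hann window to each, and computes the power spectrum with a direct DFT. The non-overlapping segmentation, the function names, and the synthetic sine input are assumptions made for illustration. A practical implementation would use an FFT routine.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (n > 1)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def dft_power(frame):
    """Power at the non-negative frequency bins of a real-valued frame (direct DFT)."""
    n = len(frame)
    power = []
    for f in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * f * t / n) for t, x in enumerate(frame))
        power.append(abs(coeff) ** 2)
    return power


def segment_power_spectra(signal, n_segments=4):
    """Split the signal into equal segments and return one windowed power spectrum each."""
    seg_len = len(signal) // n_segments
    if seg_len < 2:
        raise ValueError("signal too short for the requested number of segments")
    window = hann(seg_len)
    spectra = []
    for k in range(n_segments):
        segment = signal[k * seg_len:(k + 1) * seg_len]
        spectra.append(dft_power([x * w for x, w in zip(segment, window)]))
    return spectra


# Synthetic input: a sine completing 4 cycles per 32-sample segment.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(128)]
spectra = segment_power_spectra(signal, n_segments=4)
peak_bins = [max(range(len(p)), key=p.__getitem__) for p in spectra]
print(peak_bins)
```

Each of the four spectra could then serve as a fixed-length feature vector for the classifier, or be averaged across segments.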
4.3. Case 3: Opening and Closing Eye Tracking Activity
The EEG biosignal is captured by monitoring the two states of the eyes: opening and closing. When the eyes open or close, brain neurons are activated, producing EEG time-series patterns. A machine/deep learning model has been proposed to recognize and classify these patterns. This model processes the EEG signal to determine whether a person's eyes are open or closed using supervised machine/deep learning techniques. The first method involves statistical feature extraction to identify features such as mean, standard deviation, skewness, kurtosis, and band power, which are then used for training, classification, and prediction. The second approach employs a neural network, a Multilayer Perceptron (MLP), using the 14 electrode values for training and classification. The third method uses a recurrent neural network, an LSTM network, to train on EEG signals and predict the output from the last layer ('0' for opening and '1' for closing). Four performance metrics are used to assess these models. An illustration in
Figure 1 showcases how the EEG technique can be utilized for monitoring and identifying brain activities.
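The statistical feature extraction used by the first method can be illustrated for a single electrode channel. The sketch below computes the mean, standard deviation, skewness, and kurtosis from population moments. The function name, the non-excess definition of kurtosis, and the sample values are assumptions, since the text does not give the exact definitions used.

```python
from statistics import fmean, pstdev


def channel_features(samples):
    """Mean, standard deviation, skewness, and kurtosis of one EEG channel."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = fmean(samples)
    std = pstdev(samples)
    if std == 0:
        raise ValueError("constant signal: skewness and kurtosis are undefined")
    # Third and fourth central moments, standardized by powers of the std.
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return {
        "mean": mean,
        "std": std,
        "skewness": m3 / std ** 3,
        "kurtosis": m4 / std ** 4,  # non-excess kurtosis (3.0 for a normal distribution)
    }


# Hypothetical electrode readings, for illustration only.
features = channel_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

Applying the function to each of the 14 electrode channels and concatenating the results would give one feature vector per recording window.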
The analysis is centered around 14 EEG values representing 14 electrodes capturing statistical information related to both eye-opening and eye-closing activities. Additional statistical computations such as minimum, maximum, mean, and standard deviation are conducted. Consequently,
Figure 5 evaluates the time series data concerning eye-opening and eye-closing activities. Furthermore,
Table 5 presents a comparative assessment of the four methods used to predict eye activities associated with opening and closing.
A research paper titled "Predicting risk of dyslexia with an online gamified test" was published in PLOS ONE, detailing the second case study. The dataset consisted of more than 3600 participants (desktop data), with the study successfully identifying over 80% of individuals with dyslexia. Subsequently, a new dataset comprising over 1300 cases (tablet data) was used to test the proposed model. The evaluation accuracy indicated a recall rate exceeding 78% for identifying dyslexia in children aged 12 years and above. This research demonstrates that machine learning techniques, as implemented in our new model, can effectively observe and detect dyslexia. A comparative evaluation of the proposed dataset is presented in
Table 6.
Based on the findings of the analysis, a strong relationship exists between Attention and Mediation, while a weaker connection is observed among Attention, Mediation, and Theta.
When examining the EEG data of participants engaged in educational video sessions designed for teaching and training purposes:
- The EEG frequencies of participants can be evaluated across different spectrum categories such as Delta (1-4 Hz), Theta (4-8 Hz), Alpha (8-14 Hz), Beta (14-30 Hz), and Gamma (30-42 Hz).
- This data can be utilized to anticipate levels of confusion, attention, or mental focus among individuals, thereby enabling the prediction of confused students.
- Various correlation parameters between frequency bands can be studied to forecast and identify specific characteristics related to attention and mediation.
- By analyzing the correlation parameters across frequency bands within a sample containing numerous dyslexic participants, it may be feasible to diagnose dyslexia.
- With a substantial dataset comprising dyslexic participants for training and testing purposes, it becomes possible to establish thresholds for feature selection leading to early diagnosis based on signals from Attention, Mediation, Theta, Beta, and Gamma.
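The band definitions listed above can be turned into per-band power features. The sketch below sums the spectral power falling into each band and derives a Theta/Beta ratio. Assigning a boundary frequency to the higher band, the function name, and the sample spectrum are assumptions made for illustration.

```python
# Band edges in Hz, as listed in the text.
BANDS = {
    "Delta": (1, 4),
    "Theta": (4, 8),
    "Alpha": (8, 14),
    "Beta": (14, 30),
    "Gamma": (30, 42),
}


def band_powers(freqs, powers):
    """Sum spectral power per band; frequencies outside 1-42 Hz are ignored."""
    totals = {name: 0.0 for name in BANDS}
    for freq, power in zip(freqs, powers):
        for name, (low, high) in BANDS.items():
            # Half-open interval: a boundary frequency belongs to the higher band.
            if low <= freq < high:
                totals[name] += power
                break
    return totals


# Hypothetical spectrum: frequency in Hz and the power measured at that frequency.
freqs = [2, 5, 6, 10, 20, 35, 50]
powers = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
totals = band_powers(freqs, powers)
theta_beta_ratio = totals["Theta"] / totals["Beta"]
print(totals, theta_beta_ratio)
```

Per-band totals of this kind are the quantities between which correlation parameters or thresholds would be computed.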
In a study conducted by Keshavarzi et al. (2022) [28], it was highlighted that the Delta band holds more significance compared to Theta and Alpha bands. Their research revealed notable differences in neural responses within the Delta band between individuals with dyslexia and those without during "neural entrainment to rhythmic audiovisual speech", potentially presenting new avenues for diagnosis or intervention.
Several datasets have been examined to scrutinize neurophysiological mechanisms aiding in the early detection of individuals with disabilities. Some datasets explore emotions—both positive and negative—in identifying frequency bands linked to learning disorders. Another dataset focuses on impaired speech signals associated with dyslexia within the Delta band frequency.