Preprint
Article

This version is not peer-reviewed.

Phenotype Correlations of Neurological Manifestations in Wolfram Syndrome: Predictive Modeling in a Spanish Cohort

A peer-reviewed article of this preprint also exists.

Submitted: 27 October 2025

Posted: 29 October 2025

Abstract

Background: Wolfram syndrome (WS) is an ultra-rare genetic disorder caused by pathogenic variants in the WFS1 gene. It combines endocrine and neurological involvement and leads to progressive neurological, autonomic, and cognitive impairment. Predicting neurological progression remains a clinical challenge, particularly in relation to genotype. Methods: Forty-five genetically confirmed patients with WS were followed in Spain between 1998 and 2024. Genetic variants were classified by exon, zygosity, and predicted wolframin production (Classes 0–3). Machine learning algorithms, including Random Forest models with gene–gene interaction terms, were applied to identify the strongest predictors of neurological involvement and to stratify phenotypic severity. Results: The most prevalent neurological signs were absence of gag reflex (67%), gait instability (64%), and dysphagia (60%), typically emerging in the third decade of life. Homozygosity for truncating variants, especially c.409_424dup16 (Val142fsX110), and absence of wolframin protein (Class 0) were the main predictors of early and severe neurological impairment. Machine learning models achieved accuracies between 88.9% and 93.3%, with wolframin class and allele 2 mutation ranking as top features. Conclusions: Integrating genetic and clinical data through machine learning enables robust prediction of neurological outcomes in WS. This approach enhances precision diagnosis and provides a framework for individualized monitoring of rare neuroendocrine disorders.

1. Introduction

Wolfram syndrome (WS), first described in 1938 by Wolfram and Wagener under the acronym DIDMOAD (diabetes insipidus, diabetes mellitus, optic atrophy, and deafness), is a rare, progressive disease with a prevalence estimated between 1 in 55,000 and 1 in 770,000 live births, depending on population genetics and consanguinity rates [1,2]. WS is inherited primarily in an autosomal recessive pattern, although autosomal dominant and sporadic cases have also been reported [3]. Most cases (WS type 1) result from biallelic pathogenic variants in WFS1 on chromosome 4p16.1, which encodes wolframin, an 890-amino-acid endoplasmic reticulum (ER) transmembrane glycoprotein involved in ER homeostasis, calcium signaling, redox balance, and apoptosis [4,5]. Loss-of-function variants—such as nonsense, frameshift, and large deletions—lead to absent or dysfunctional wolframin, triggering chronic ER stress, β-cell failure, and progressive neurodegeneration [5]. More than 200 different WFS1 mutations have been reported, most clustering in exon 8, which encodes the transmembrane and C-terminal domains [3]. Truncating mutations, particularly in exons 4 and 8, have been associated with earlier onset of diabetes and optic atrophy, as well as more severe neurological manifestations [6]. A rarer subtype, Wolfram syndrome type 2 (WS2), results from biallelic variants in CISD2 on chromosome 4q22–q24. WS2 shares overlapping features with WS1 but typically lacks diabetes insipidus and may present gastrointestinal ulceration, bleeding tendency, and sensorimotor neuropathy [2]. These molecular distinctions broaden the phenotypic spectrum of WS and highlight the role of endoplasmic reticulum stress and mitochondrial dysfunction as shared pathogenic pathways.
Clinically, WS typically appears in early childhood with insulin-dependent diabetes mellitus before age 6, followed by optic atrophy around age 11 [1]. Sensorineural hearing loss usually develops during adolescence, and diabetes insipidus in the second decade of life. Neurological manifestations—including cerebellar ataxia, gait instability, dysarthria, dysphagia, dysmetria, peripheral neuropathy, anosmia, psychiatric disturbances, and progressive cognitive decline—often emerge in the third to fourth decade and are major determinants of morbidity and mortality [2,7]. Brainstem dysfunction, such as absence of gag reflex, sialorrhea, and abnormal respiratory patterns, contributes significantly to aspiration pneumonia and premature death [1].
Neuroimaging findings have reinforced WS as a primary neurodegenerative disorder. MRI volumetric studies show early involvement of the brainstem and cerebellum, with marked atrophy of the pons, medulla, and cerebellar hemispheres even in pediatric patients [8,9]. Diffusion tensor imaging (DTI) reveals widespread microstructural abnormalities in corticospinal tracts, cerebellar peduncles, and optic radiations, suggesting early axonal degeneration [10]. Longitudinal MRI studies confirm that brainstem and cerebellar atrophy progress over time and correlate with gait instability, dysarthria, and ataxia [9]. Later cortical and hippocampal involvement has been reported, correlating with cognitive and psychiatric symptoms [8].
Strong genotype–phenotype correlations have been identified. Homozygosity for truncating variants, particularly c.409_424dup16 (Val142fsX110), is overrepresented among patients with severe neurological phenotypes, including early gait instability, absence of gag reflex, and cognitive decline [6]. Frameshift and nonsense variants producing a “wolframin class 0” phenotype (complete loss of protein) are linked to earlier onset of diabetes and optic atrophy and to more rapid neurological progression than missense variants, which may retain partial protein activity [3]. Despite these insights, longitudinal data quantifying the prevalence, age of onset, and genetic predictors of specific neurological manifestations remain limited. Furthermore, few studies have integrated machine learning or predictive models based on functional classification of wolframin to estimate individual risk and disease trajectories.
In the present study, we analyze a large Spanish cohort of 45 genetically confirmed WS patients, classifying WFS1 variants by type (truncating vs. missense), zygosity (homozygous vs. compound heterozygous), and predicted functional class, and correlating them with neurological features such as dysphagia, dysmetria, gait instability, ataxia, anosmia, cognitive decline, and absence of gag reflex. We further apply machine learning techniques to improve risk prediction, define temporal patterns of neurological involvement, and guide early intervention strategies aimed at preventing life-threatening complications such as aspiration pneumonia.

2. Materials and Methods

2.1. Study Design and Participants

The database was structured into three main components: (i) epidemiological variables, (ii) monitored genetic attributes and their potential interactions, and (iii) symptom class vectors. Clinical and genetic data were collected as part of a longitudinal follow-up initiated in 1998 under the supervision of the principal investigator, who has also coordinated the research team. During the early years, information was obtained from medical reports and clinical documentation provided by hospitals across Spain, which the principal investigator reviewed and clinically assessed directly. Since September 2010, patients have been directly evaluated by a multidisciplinary team specialized in Wolfram syndrome, integrated within the Almería Health District, with coordinated activity across several facilities of the Andalusian public health system, including Hospital La Inmaculada (Huércal-Overa) and, more recently, Hospital Universitario Torrecárdenas. The same principal investigator has continued the systematic collection, evaluation, and updating of the database, ensuring methodological consistency and data reliability throughout the study period.

2.2. Clinical and Cognitive Assessment

Cognitive function was assessed using age-adjusted Wechsler scales (WAIS/WISC). Neurological examinations systematically evaluated brainstem, cerebellar, cortical, and peripheral domains. Each clinical manifestation—such as gait instability, dysphagia, dysmetria, or anosmia—was encoded as a binary variable (present/absent) to facilitate computational modeling.
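The binary encoding described above can be illustrated with a minimal sketch. The symptom names and the helper `encode_symptoms` are hypothetical and chosen only for illustration; they are not the column names of the study database.

```python
# Hypothetical, illustrative symptom list (not the study's actual column names).
SYMPTOMS = ["gait_instability", "dysphagia", "dysmetria", "anosmia"]


def encode_symptoms(observed):
    """Return a 0/1 vector, in the fixed order of SYMPTOMS, for one patient."""
    unknown = set(observed) - set(SYMPTOMS)
    if unknown:
        # Fail loudly on misspelled or unlisted manifestations.
        raise ValueError(f"unknown symptom(s): {sorted(unknown)}")
    return [1 if name in observed else 0 for name in SYMPTOMS]


# Toy patient (not cohort data) presenting dysphagia and anosmia only.
patient = {"dysphagia", "anosmia"}
vector = encode_symptoms(patient)
print(dict(zip(SYMPTOMS, vector)))
```

Keeping the symptom order fixed means every patient maps to a vector of the same length, which is what a tabular classifier expects.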

2.3. Genetic Analysis and Functional Classification

Genetic variables included variant type (truncating, missense, deletion, duplication), exon position, and zygosity (homozygous or compound heterozygous). Each variant was categorized into one of four wolframin functional classes (0–3), reflecting the predicted level of protein expression and functionality:
  • Class 0: No wolframin production (e.g., homozygous nonsense or frameshift variants).
  • Classes 1–3: Gradual preservation of partial function according to in silico and literature-based predictions.

2.4. Machine Learning Model

To predict neurological outcomes based on genetic features, a Random Forest (RF) model was implemented using Python 3.9. The model incorporated both main effects and pairwise interaction terms between genetic variables to explore potential synergistic effects influencing neurological expression. Model performance was evaluated through 80/20 cross-validation, using accuracy, precision, and recall as performance metrics. Feature importance was estimated using the mean decrease in Gini impurity, allowing for ranking of the most predictive genetic attributes. The analytical framework followed Breiman’s Random Forest methodology [13] and Kuhn and Johnson’s applied predictive modeling principles [14].
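As an illustration of the quantity behind Gini-based feature importance, the following sketch computes the impurity decrease produced by a single split. It is a simplified, standard-library example on toy data, not the study's implementation; the function names `gini` and `gini_decrease` are hypothetical. In a Random Forest, the reported importance of a feature is obtained by accumulating such decreases over every split that uses the feature and averaging across trees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity, 1 - sum_k p_k^2, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(feature, labels, threshold):
    """Impurity decrease obtained by splitting on `feature <= threshold`."""
    left = [y for x, y in zip(feature, labels) if x <= threshold]
    right = [y for x, y in zip(feature, labels) if x > threshold]
    n = len(labels)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted_children


# Toy data (not from the cohort): wolframin class 0-3 and a binary symptom.
wolframin_class = [0, 0, 0, 1, 2, 3]
symptom = [1, 1, 1, 0, 0, 0]

# Splitting at class 0 separates the two outcomes perfectly in this toy example.
print(gini_decrease(wolframin_class, symptom, threshold=0))
```

A larger decrease means the split produces purer child nodes, so features that repeatedly yield large decreases rank higher.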

3. Results

3.1. Epidemiology

Table 1 shows the demographic profile and genetic confirmation rate of the 45 patients fulfilling the clinical criteria for Wolfram syndrome who were followed longitudinally in Spain between 1998 and 2024. Most patients were of white ethnicity, and genetic confirmation was available in 95.6% of cases. About one quarter of patients were born to consanguineous parents, and over one third had affected siblings. The mean age of the surviving cohort in 2024 was 27.5 years, consistent with the natural history reported in previous European cohorts [1,3].

3.2. Cognitive and Neurological Manifestations

Neurological and cognitive manifestations were systematically assessed. Cognitive function was evaluated using the Wechsler Adult Intelligence Scale (WAIS) and, for pediatric participants, the Wechsler Intelligence Scale for Children (WISC). These standardized instruments assess global and domain-specific intelligence (verbal comprehension, working memory, and processing speed). Administration was performed by trained professionals and required approximately 60–90 minutes. IQ categories were defined as follows:
  • Superior intelligence: ≥130.
  • High/above average intelligence: 115–129.
  • Average intelligence: 85–114.
  • Low intelligence (borderline or below average): 70–84.
  • Intellectual disability: ≤69.
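The IQ bands listed above can be written as a small lookup, shown here only as a sketch. The function name `iq_category` and the short labels are hypothetical; the cut-offs are those given in the list.

```python
def iq_category(score):
    """Map a Wechsler IQ score to the categories listed above."""
    if score >= 130:
        return "Superior"
    if score >= 115:
        return "High/above average"
    if score >= 85:
        return "Average"
    if score >= 70:
        return "Low"
    return "Intellectual disability"


# Boundary checks on illustrative scores (not patient data).
for score in (130, 115, 85, 70, 69):
    print(score, iq_category(score))
```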
Neurological examination systematically covered brainstem, cerebellar, cortical, and peripheral functions, including gait instability, dysmetria, dysphagia, absence of gag reflex, sialorrhea, tremor, and altered deep-tendon reflexes. Each clinical manifestation was encoded as a binary variable (present/absent) for analytical purposes.

3.3. The Genetic Attributes

A clear understanding of the parameters and distribution of genetic variables among patients is essential for the appropriate interpretation of the results and for assessing their correlation with the neurological disorders analyzed in Wolfram syndrome. However, the limited availability of comprehensive genetic data poses a significant challenge for both modeling and clinical interpretation. Table 2 presents the genetic status of Spanish patients with Wolfram syndrome. Wolframin, encoded by the WFS1 gene, plays a critical role in maintaining endoplasmic reticulum (ER) homeostasis, regulating intracellular calcium levels, and protecting neurons and the β-cells of the pancreas under stress conditions. Its deficiency is associated with β-cell failure (diabetes mellitus) and progressive neurodegeneration, including loss of vision and hearing.
Patients were classified into four groups according to the predicted wolframin protein production and functionality:
  • Code 0: No wolframin production (both alleles carrying nonsense mutations, deletions, duplications, or splice defects leading to premature termination of mRNA translation).
  • Code 1: Partial wolframin production; compound heterozygotes with one missense and one truncating variant, resulting in approximately half the normal protein level but largely dysfunctional due to misfolding.
  • Code 2: Both alleles carry missense mutations generating misfolded proteins with partial or absent function.
  • Code 3: Heterozygous condition producing half normal and half misfolded protein, typically linked to a milder or dominant phenotype.
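The four codes above amount to a rule over the predicted effect of each allele. The sketch below is one possible reading of those rules, under the assumption that each allele has already been annotated as "truncating", "missense", or "normal"; the function name `wolframin_code` and these three labels are hypothetical and do not come from the study database.

```python
def wolframin_code(allele1, allele2):
    """Assign wolframin code 0-3 from two allele annotations.

    Each argument is assumed to be "truncating", "missense", or "normal".
    """
    effects = sorted([allele1, allele2])
    if effects == ["truncating", "truncating"]:
        return 0  # no wolframin production
    if effects == ["missense", "truncating"]:
        return 1  # about half the protein level, largely misfolded
    if effects == ["missense", "missense"]:
        return 2  # misfolded protein from both alleles
    if effects == ["missense", "normal"]:
        return 3  # half normal, half misfolded protein
    # Combinations not covered by the four codes are rejected, not guessed.
    raise ValueError(f"combination not covered by codes 0-3: {effects}")


print(wolframin_code("truncating", "truncating"))
print(wolframin_code("missense", "truncating"))
```

Rejecting uncovered combinations keeps the sketch from silently assigning a code that the classification does not define.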
In our database, 48.9% of individuals were heterozygous carriers (Table 2).
Table 3 and Table 4 summarize the WFS1 mutations identified at the cDNA and protein levels, respectively. These genetic variants were later incorporated into the predictive model to explore their relationship with the neurological manifestations observed in the cohort.
Each patient’s WFS1 mutations were further classified according to exon location and predicted functional impact. For each allele, categorical variables were generated to indicate whether the mutation was located in exon 4 or exon 8 (mut1_exon4_class and mut2_exon8_class). A combined variable (mut12_exon_class) indicated whether both mutations were in exon 4 (class 1), both in exon 8 (class 2), or located in different exons (class 3).
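The combined exon variable described above can be sketched as follows. The function takes the exon number of each allele as input; this input format is an assumption made for illustration, while the output classes (1, 2, 3) follow the definition in the text.

```python
def mut12_exon_class(exon_allele1, exon_allele2):
    """Combined exon class: 1 = both in exon 4, 2 = both in exon 8, 3 = different exons."""
    if exon_allele1 != exon_allele2:
        return 3
    if exon_allele1 == 4:
        return 1
    if exon_allele1 == 8:
        return 2
    # Both mutations in the same exon other than 4 or 8 is not covered by the text.
    raise ValueError(f"no class defined for both mutations in exon {exon_allele1}")


print(mut12_exon_class(4, 4))
print(mut12_exon_class(8, 8))
print(mut12_exon_class(4, 8))
```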
In addition, variables mut1_protein_class and mut2_protein_class were defined to characterize the protein effect of each mutation, and genetic_condition_class categorized patients as homozygous, compound heterozygous, or triple heterozygous. Wolframin production was stratified into four classes based on predicted protein expression and function: class 0 (absent protein due to premature stop codon), class 1 (~50% production but likely misfolded), class 2 (amino acid substitutions on both alleles resulting in misfolded protein), and class 3 (autosomal dominant heterozygotes with ~50% normal protein production).
To investigate potential epistatic and synergistic effects, interaction terms combining wolframin class, exon location, and genetic condition were created. These variables were used as interaction predictors in the subsequent machine learning model to capture non-linear genotype–phenotype relationships.
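One common way to build such interaction predictors is to multiply pairs of coded variables. The sketch below assumes this product encoding, which is suggested by variable names such as prod_wm12 but is not confirmed here; the helper `add_pairwise_products` and the generated column names are hypothetical.

```python
from itertools import combinations


def add_pairwise_products(row):
    """Return a copy of `row` extended with the product of every pair of variables."""
    extended = dict(row)
    for a, b in combinations(sorted(row), 2):
        extended[f"prod_{a}_x_{b}"] = row[a] * row[b]
    return extended


# Toy patient (not cohort data) with three coded genetic variables.
patient = {"wolframin_class": 2, "mut12_exon_class": 3, "genetic_condition_class": 1}
features = add_pairwise_products(patient)
for name, value in features.items():
    print(name, value)
```

A caveat of product terms on integer codes is that any variable coded 0 (for example, wolframin class 0) forces the product to 0, so the interaction carries no extra information for those patients; one-hot encoding before forming interactions is an alternative.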
All the genetic variables used in this research and their interactions are shown in Table 5.

3.4. Genetic–Clinical Correlation Analysis

Clinical signs were cross-analyzed with genetic alterations using the Random Forest (RF) algorithm [13], an ensemble learning method for both classification and regression tasks. RF builds multiple decision trees and combines their outputs, enhancing prediction accuracy, reducing overfitting, and enabling estimation of prediction uncertainty.
To identify the most relevant predictors, we applied the Out-of-Bag (OOB) feature importance algorithm [14], which leverages data samples not used during tree training as an internal validation set. By measuring the change in model accuracy after random permutation of each feature in the OOB data, we obtained a robust estimate of each variable’s contribution to the outcome, without requiring a separate test dataset.
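The permutation step described above can be sketched in a few lines. This is a simplified illustration, not the study's code: it uses a single held-out toy dataset and a fixed stand-in rule (`predict`) in place of a fitted forest, whereas a true OOB estimate evaluates each tree on the samples left out of its own bootstrap. All names are hypothetical.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, col, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling column `col` of the held-out data."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    drops = []
    for _ in range(n_repeats):
        shuffled = [row[col] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
        drops.append(baseline - accuracy(predict, X_perm, y))
    return sum(drops) / n_repeats


# Toy held-out data: column 0 = wolframin class, column 1 = uninformative flag.
X = [[0, 1], [0, 0], [0, 1], [1, 0], [2, 1], [3, 0]]
y = [1, 1, 1, 0, 0, 0]


def predict(row):
    # Stand-in for a fitted model: predicts the symptom only for class 0.
    return 1 if row[0] == 0 else 0


print(permutation_importance(predict, X, y, col=0))
print(permutation_importance(predict, X, y, col=1))
```

Shuffling a column the model relies on breaks its link to the outcome and lowers accuracy, while shuffling an ignored column leaves accuracy unchanged.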
The objective was two-fold:
1. To train classifiers predicting neurological symptoms based on genetic variables and their interactions.
2. To rank these variables according to their relative importance in the development of each neurological manifestation.
This approach facilitates a better understanding of the genetic mechanisms underlying Wolfram syndrome and their relationship to clinical outcomes. Machine learning methodologies thus provide powerful tools to uncover genotype–phenotype links and to support data-driven clinical decision-making.

3.5. Key Clinical-Genetic Findings in Wolfram Syndrome

Neurological symptoms were present in 42 of the 45 patients; the remaining three individuals (mean age: 14 years, SD: 1.63) showed no neurological impairment. Regarding cognitive function, most patients exhibited an average-to-high IQ. Only 2.2% (n = 1) presented intellectual disability due to developmental delay, 11.1% (n = 5) had low IQ, 40.0% (n = 18) had average IQ, and 46.7% (n = 21) had high or superior IQ.
Table 6 summarizes the key clinical–genetic findings in Wolfram syndrome. The dataset shows that the most prevalent neurological features are absence of gag reflex (67%), gait instability (64%), and dysphagia (60%), indicating that core motor and autonomic dysfunctions are frequent in this cohort. In contrast, cognitive decline (36%), anosmia (40%), and dysmetria (44%) are less frequent, suggesting that cognitive or sensory symptoms emerge later or affect fewer individuals.
Most neurological manifestations begin during early to mid-adulthood (23–26 years), whereas cognitive decline occurs later (mean onset: 29.9 years, higher variability). This temporal pattern of neurological impact reflects a progressive sequence from motor and autonomic dysfunction to cognitive involvement.
There is a notable sex imbalance across neurological phenotypes: men represent 60–67% of cases for dysphagia, dysmetria, gait instability, and ataxia. Although women comprise 44.5% of the overall cohort, their representation within symptom groups remains lower, suggesting possible sex-linked penetrance effects or reporting bias. Some features, such as absence of gag reflex (53.3% male), show a nearly equal sex distribution, indicating that certain neurological deficits are not sex-dependent.
Genetically, homozygosity for pathogenic WFS1 variants was overrepresented across all symptom groups (≥62%), reaching 81% in cognitive decline and 75% in ataxia. These results reinforce the idea that biallelic truncating mutations, particularly ex4 c.409_424dup16 (Val142fsX110), drive the most severe neurological phenotypes. The wolframin Type 0 phenotype, defined by complete absence of protein production, correlates strongly with disease severity, with prevalence ranging from 66% to 83% across neurological manifestations.
This supports a mechanistic model in which loss of functional wolframin leads to progressive neurodegeneration, particularly affecting cerebellar and brainstem circuits. The Val142fsX110 mutation emerges as the dominant variant across all symptom categories, highlighting its central pathogenic role. Other variants, such as Trp371X and Val142fs251, were associated with specific symptoms, including absence of gag reflex and anosmia.
Overall, the high homozygosity rates indicate that most symptomatic individuals carry homozygous pathogenic variants, and all major symptoms correlate with Type 0 wolframin (67–83% prevalence). Higher prevalence of Type 0 protein loss is associated with earlier onset and greater motor and sensory impairment, reinforcing the genotype–phenotype correlation.
The data delineate a progressive neurodegenerative trajectory:
  • Early manifestations (≈23–24 years): dysphagia, sialorrhea, absent gag reflex, and dysmetria, primarily reflecting motor and autonomic dysfunction.
  • Later manifestations (26–30 years): gait instability, ataxia, anosmia, and cognitive decline, corresponding to sensory and cognitive system involvement.
Core motor and autonomic symptoms are more frequent and earlier, while cognitive and sensory symptoms, though less prevalent, emerge later in disease progression. A small set of high-impact mutations, dominated by Val142fsX110, drives most observed phenotypes. Protein-class predictors further enhance symptom prediction, and Type 0 wolframin defects are the key determinants of disease severity.
The pattern demonstrates a progressive sequence of neurological involvement, with a slight male predominance, beginning with motor and autonomic manifestations and advancing to sensory and cognitive impairment over time.

3.6. Machine Learning Model Performance and Feature Importance

Table 7 summarizes the performance of the machine learning models and the key features driving neurological outcomes in Wolfram syndrome. Across all phenotypes, the Random Forest models outperformed naïve classification by 25–40 percentage points, demonstrating robust reliability for clinical prediction. Mut2_protein_class consistently ranked as the most influential feature (22–30% importance), highlighting allele 2 as the primary determinant of phenotype severity. Interaction terms such as prod_wm12 and prod_mgm12 systematically improved model fit, underscoring the importance of combinatorial modeling rather than single-variant interpretation. Protein-level predictors repeatedly emerged as dominant contributors, indicating that functional protein classes provide stronger predictive power than individual variants alone. The highest model accuracies (≥93%) were observed for absence of gag reflex and ataxia, suggesting that these are the most genetically driven neurological phenotypes. These findings support the integration of genetic classifiers into clinical decision-support tools, enabling neurologists to anticipate early dysphagia, airway compromise, and gait impairment, which are major contributors to morbidity and mortality in Wolfram syndrome. Combining genetic data (mutation type, zygosity, wolframin class) with clinical features thus provides a robust framework for risk stratification, precision prognosis, and surveillance of neurodegenerative complications in Wolfram syndrome.
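To make the comparison with naïve classification concrete, the sketch below assumes that the naïve classifier always predicts the most frequent class, which is one common definition and may differ from the one used in Table 7. The label vector and the model accuracy are illustrative values chosen to resemble the reported order of magnitude (about two thirds of patients affected, accuracy near 93%); they are not cohort data. Function names are hypothetical.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a classifier that always predicts the most frequent class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


def gain_in_points(model_accuracy, labels):
    """Improvement of the model over the majority baseline, in percentage points."""
    return 100 * (model_accuracy - majority_baseline_accuracy(labels))


# Illustrative labels: 30 of 45 patients with the symptom (about 67%).
labels = [1] * 30 + [0] * 15
# Illustrative model accuracy: 42 of 45 patients classified correctly.
model_accuracy = 42 / 45

print(round(100 * majority_baseline_accuracy(labels), 1))
print(round(gain_in_points(model_accuracy, labels), 1))
```

Reporting the gain over this baseline matters for frequent symptoms, because a high raw accuracy can partly reflect class imbalance rather than predictive skill.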

4. Discussion

This study provides new insights into the genotype–phenotype relationships in Wolfram syndrome (WS) by integrating comprehensive genetic characterization with machine learning–based predictive modeling. Our findings demonstrate that functional impairment of wolframin, rather than mutation type alone, is the strongest determinant of neurological severity and disease progression.

4.1. Genetic Mechanisms and Phenotypic Expression

Patients carrying biallelic truncating variants, particularly the ex4 c.409_424dup16 (Val142fsX110) mutation, showed the most severe and early-onset neurological manifestations, including dysphagia, dysmetria, and gait instability. These results align with previous studies reporting progressive brainstem and cerebellar atrophy as major hallmarks of WS [7,8,9,10]. The predominance of Class 0 wolframin, defined by complete absence of protein production, supports a mechanistic model in which ER stress and calcium dysregulation trigger selective neuronal vulnerability. The temporal pattern observed (motor and autonomic dysfunction in the early 20s, followed by sensory and cognitive decline) suggests a sequential neurodegenerative cascade consistent with the progressive nature of the syndrome.

4.2. Predictive Modeling and Variable Importance

The Random Forest (RF) models achieved 25–40 percentage-point improvements over naïve classification across neurological outcomes, underscoring the strength of ensemble learning approaches for rare-disease prediction. Among all predictors, mut2_protein_class emerged as the most influential variable (22–30% importance), indicating that the second allele critically modulates phenotype expression through additive or compensatory effects. Furthermore, the introduction of interaction terms—combining wolframin production, exon location, and zygosity—significantly improved model fit, capturing non-linear and epistatic relationships often missed by conventional analyses. The ability of AI algorithms to identify complex genetic–phenotypic interactions represents a major methodological advance, offering earlier and more accurate risk stratification for clinical use.

4.3. Clinical and Translational Implications

The RF models achieved the highest accuracies (≥93%) for absence of gag reflex and ataxia, indicating that these are the most tightly linked phenotypes to WFS1-related neurodegeneration. Integrating these predictive variables into clinical decision-support tools may help clinicians anticipate dysphagia, airway compromise, and gait instability, which are key contributors to morbidity and mortality in WS. From a translational perspective, predictive modeling can guide surveillance protocols by identifying patients at highest risk for early neurological deterioration. This approach also provides a reproducible framework for evaluating the pathogenic potential of newly identified WFS1 variants and could inform patient selection and outcome prediction in interventional or neuroprotective trials.

4.4. Limitations and Future Perspectives

The main limitations of this study are the small cohort size (n = 45) and sex imbalance, both inherent to WS rarity. Additionally, its cross-sectional design limits causal inference. Future research should aim to validate these models in multicenter, longitudinal cohorts, incorporating multimodal data such as EEG, MRI volumetrics, and molecular biomarkers. Such integration would further clarify how ER-stress-mediated cellular dysfunction translates into clinical progression and may optimize the predictive accuracy of AI-based frameworks.
In summary, this research shows that functional classification of wolframin and allele interaction modeling can robustly predict neurological outcomes in WS. By linking molecular dysfunction to clinical expression through machine learning, we provide a foundation for precision medicine approaches in this complex neurodegenerative disorder.

5. Conclusion

Integrating detailed genetic characterization with Random Forest machine learning models enables accurate prediction of neurological outcomes in Wolfram syndrome. Functional wolframin classification (Types 0–3) and allelic interactions were the strongest predictors of disease severity. This framework bridges genetics, bioinformatics, and clinical neurology, offering a path toward precision prognosis and individualized follow-up in WFS1-related disorders. Future expansion of these models with longitudinal and multimodal data will enhance their translational potential and support personalized therapeutic strategies.

Author Contributions Statement

  • G.E.-B.: Designed the study, established the clinical framework, and collected all clinical variables. She led the manuscript writing, conducted statistical analyses on selected variables, and ensured the clinical relevance of the findings. Additionally, she performed the literature search and referenced sources according to Vancouver style guidelines. She also supervised the overall study design and integration of multidisciplinary data.
  • J.L.F.-M.: Assisted in the revision and improvement of the manuscript, reviewed the statistical analyses performed by G.E.-B., and conducted statistical analyses of complex variables, contributing to methodological validation. As an expert in mathematical modeling and biostatistics, he provided guidance in statistical modeling, data interpretation, and quantitative methods applied to medicine.
  • M.L.B.-C.: Contributed to the genetic data interpretation, reviewed the genetic and molecular aspects of the manuscript, and ensured the scientific accuracy of the biological sections.
  • All authors: Reviewed and approved the final version of the manuscript in both Spanish and English.

Funding

This study was supported by the Andalusian Regional Ministry of Health (Consejería de Salud de Andalucía) through the grant AP-0009-2020-C1-F2 FPS 2020 – Research Call for Primary Care, Regional Hospitals, and CHARES. Additional institutional support was provided by the “SAS 2024 – Reinforcement of Research Activity in Clinical Units of the Andalusian Health Service”, which allocated dedicated time for scientific research and contributed to the development of this study.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of Torrecárdenas University Hospital (protocol code 75/2020; approval date 27 February 2020).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study. Written informed consent was obtained from all patients for the publication of this paper.

Data Availability Statement

The data supporting the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors express their sincere gratitude to the families affected by Wolfram Syndrome in Spain and Portugal, whose collaboration and trust made this study possible.
We also acknowledge the Spanish Association for Research and Support to Wolfram Syndrome (Asociación Española para la Investigación y Ayuda al Síndrome de Wolfram) for its continued support. The authors thank Dr. Javier Ruiz Martínez, neurologist, for having provided early guidance to the principal investigator on the systematic neurological assessment of patients with Wolfram Syndrome.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Barrett TG, Bundey SE, Macleod AF. Neurodegeneration and diabetes: UK nationwide study of Wolfram (DIDMOAD) syndrome. Lancet. 1995;346(8988):1458–1463.
  2. Rigoli L, Lombardo F, Di Bella C. Wolfram syndrome and WFS1 gene. Clin Genet. 2018;93(1):3–14.
  3. Urano F. Wolfram Syndrome: Diagnosis, management, and treatment. Curr Diab Rep. 2016;16(1):6.
  4. Inoue H, Tanizawa Y, Wasson J, Behn P, Kalidas K, Bernal-Mizrachi E, et al. A gene encoding a transmembrane protein is mutated in patients with diabetes mellitus and optic atrophy (Wolfram syndrome). Nat Genet. 1998;20(2):143–148.
  5. Fonseca SG, Urano F, Burcin M, Gromada J. Wolfram syndrome 1 and adenylyl cyclase 8 interact at the endoplasmic reticulum: a possible mechanism for disrupted calcium signaling in the β-cell. J Biol Chem. 2010;285(47):37902–37910.
  6. Cano A, Esteban-Bueno G, et al. Genetic and clinical characterization of Spanish patients with Wolfram syndrome: genotype–phenotype correlations. Diagnostics. 2021;11(12):2278.
  7. Antenora A, et al. The wide spectrum of neurological involvement in Wolfram syndrome: clinical, neurophysiological, and MRI study. Orphanet J Rare Dis. 2016;11(1):88.
  8. Shaw CJ, et al. Progressive brainstem and cerebellar atrophy in Wolfram syndrome: MRI and clinical correlations. J Neurol Sci. 2014;346(1–2):250–256.
  9. Sequeira C, et al. Longitudinal MRI evidence of neurodegeneration in Wolfram syndrome. NeuroImage Clin. 2022;35:103102.
  10. Hershey T, et al. Early brain vulnerability in Wolfram syndrome: MRI and DTI findings. Neuroimage. 2012;62(4):1538–1544.
  11. Esteban-Bueno G, Fernández-Martínez JL. Gonadal dysfunction in Wolfram syndrome: A prospective study. Diagnostics. 2025;15(13):1594.
  12. Esteban-Bueno G, Berenguel-Hernández AM, Fernández N, Navarro M, Coca JR. Neurosensory affectation in patients affected by Wolfram syndrome: Descriptive and longitudinal analysis. Healthcare. 2023;11(13):1888.
  13. Breiman L. Random Forests. Mach Learn. 2001;45:5–32.
  14. Kuhn M, Johnson K. Applied Predictive Modeling. Springer; 2013.
Table 1. Demographic and Genetic Characteristics of the Spanish Wolfram syndrome cohort (1998–2024).
| Characteristic | n (%) / Mean ± SD |
|---|---|
| Total patients | 45 |
| Sex | Male: 25 (55.5%); Female: 20 (44.5%) |
| Genetic confirmation available | 43 (95.6%) |
| Parental consanguinity | 12 (26.7%) |
| Siblings within cohort | 17 (37.8%) |
| Ethnicity | White: 39 (86%); Romani: 4; Arab origin: 2 |
| Vital status (2024) | Survivors: 35; Deceased: 10 |
| Age (survivors) | 27.5 ± 11.1 years |
Table 2. Genetic status of Spanish patients with Wolfram syndrome.
| Variable | Categories | Frequency in cohort |
|---|---|---|
| WFS1 genotype | 1 = WFS1 homozygote; 2 = WFS1 compound heterozygote; 3 = WFS1 triple heterozygote (one allele with two mutations + the homologous allele with one mutation) | 48.9% heterozygotes |
Table 3. WFS1 mutations identified at the cDNA level.
| Allele | Variants identified (cDNA) |
|---|---|
| Mutation 1 (Allele 1) | ex4 c.409_424dup16; ex4 c.472 G>A; ex4 c.489_424dup; ex4 c.506delA; ex8 c.1060_1062delTTC; ex8 c.1096 C>T; ex8 c.1113 G>A; ex8 c.1124 G>A; ex8 c.1230_1233delCTCT; ex8 c.1289 C>A; ex8 c.1558 C>T; ex8 c.1582 T>G; ex8 c.2020 G>A; ex8 c.2051 C>T; ex8 c.2118 C>A; ex8 c.2206 G>A; ex8 c.2209 G>A; ex8 c.2564 C>G; ex8 c.873 C>A; ex8 c.963_966del4; ex8 c.977 C>T |
| Mutation 2 (Allele 2) | ex4 c.409_424dup16; ex8 c.1113 G>A; ex8 c.1230_1233del; ex8 c.1230_1233delCTCT; ex8 c.1329 C>G; ex8 c.1340 T>C; ex8 c.1456_1457insT; ex8 c.1462_1463ins12; ex8 c.1511 C>T; ex8 c.1525_1538del15; ex8 c.1558 C>T; ex8 c.1582 T>G; ex8 c.1612 T>C; ex8 c.2020 G>A; ex8 c.2118 C>A; ex8 c.2206 G>A; ex8 c.2206 G>C; ex8 c.2209 G>A; ex8 c.2257 G>T; ex8 c.854 G>T; ex8 c.873 C>A; ex8 c.963_966del; ex8 c.1463_1474 |
Table 4. WFS1 mutations identified at the protein level.
| Allele | Variants identified (WFS1 protein) |
|---|---|
| Mutation 1 (Allele 1) | None; Ala326Val; Ala684Val; Arg375His; Gln366X; Gln520X; Glu158Lys; Glu169GlyfsX2118; Glu674Arg; Glu737Lys; Gly736Arg; Gly736Ser; His322fsX; Phe354del; Phe354fs*; Phe854del; Phedel; Ser430X; Ser855fs*; Trp371X; Tyr706X; TyrX*; Val142Glyfs*118; Val142Glyfs*X; Val142fs*; Val142fs251*; Val142fsX; Val142fsX110; Val412Serfs*29; Val412SerfsX29; Y528D |
| Mutation 2 (Allele 2) | None; Arg285Leu; Gln486Leufs*57X; Gln520X; Glu674Arg; Glu737Lys; Glu753X; Gly736Arg; Gly736Ser; His322Thrfs*; Leu447Pro; Phe538Leu; Pro504Leu; Ser443Arg; Trp371X; Tyr706X; Val142fs*; Val142fs110; Val142fs251*; Val142fsX110; Val412Serfs*29; Val412SerfsX; Val491ProinsLeuIleThrVal; Val509_Tyr513del5; Valfs*; Y528D |
Table 5. Genetic and Interaction Variables Used in the Machine Learning Model. 
| Variable | Definition / Description |
|---|---|
| Mut1_Protein_Class | Protein effect of mutation on allele 1. |
| Mut2_Protein_Class | Protein effect of mutation on allele 2. |
| Mut12_Exon_Class | Classification based on whether both mutations are located in the same exon (exon 4, exon 8) or in different exons. |
| Tipo_mut1_exon | Type of mutation affecting allele 1 and its corresponding exon. |
| Tipo_mut2_exon | Type of mutation affecting allele 2 and its corresponding exon. |
| Genetic_Condition | Categorical variable indicating zygosity (homozygous, compound heterozygous, triple heterozygous). |
| Wolframin_Class | Classification of wolframin protein production. Type 0: no protein (premature stop codon). Type 1: ~50% protein, likely non-functional (misfolding). Type 2: misfolded protein from both alleles. Type 3: autosomal dominant case, ~50% normal protein production. |
| Prod_wm1 | Interaction term: Wolframin class × Mut1_Exon_Class. |
| Prod_wm2 | Interaction term: Wolframin class × Mut2_Exon_Class. |
| Prod_wm12 | Interaction term: Wolframin class × Mut12_Exon_Class. |
| Prod_wmg | Interaction term: Wolframin class × Genetic_Condition_Class. |
| Prod_mgm1 | Interaction term: Genetic_Condition_Class × Mut1_Exon_Class. |
| Prod_mgm2 | Interaction term: Genetic_Condition_Class × Mut2_Exon_Class. |
| Prod_mgm12 | Interaction term: Genetic_Condition_Class × Mut12_Exon_Class. |
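The interaction variables in Table 5 are defined as products of class variables. The following is a minimal sketch of how such terms could be derived, assuming each class is stored as an integer code. The `Patient` container, its field names, and the example codes are hypothetical; the preprint does not list the numeric codings actually used.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical integer-coded genetic descriptors for one patient."""
    wolframin_class: int    # 0-3, wolframin production class (Table 5)
    genetic_condition: int  # 1 = homozygote, 2 = compound het., 3 = triple het. (Table 2)
    mut1_exon_class: int    # ASSUMED code for the exon of the allele 1 mutation
    mut2_exon_class: int    # ASSUMED code for the exon of the allele 2 mutation
    mut12_exon_class: int   # ASSUMED code for same-exon vs. different-exon status


def interaction_terms(p: Patient) -> dict:
    """Build the seven product terms named in Table 5 from the coded classes."""
    return {
        "prod_wm1": p.wolframin_class * p.mut1_exon_class,
        "prod_wm2": p.wolframin_class * p.mut2_exon_class,
        "prod_wm12": p.wolframin_class * p.mut12_exon_class,
        "prod_wmg": p.wolframin_class * p.genetic_condition,
        "prod_mgm1": p.genetic_condition * p.mut1_exon_class,
        "prod_mgm2": p.genetic_condition * p.mut2_exon_class,
        "prod_mgm12": p.genetic_condition * p.mut12_exon_class,
    }


if __name__ == "__main__":
    # Illustrative patient only; the codes are not taken from the cohort data.
    example = Patient(wolframin_class=1, genetic_condition=2,
                      mut1_exon_class=4, mut2_exon_class=8, mut12_exon_class=3)
    for name, value in interaction_terms(example).items():
        print(f"{name}: {value}")
```

One consequence of a plain product under this assumed coding: if Class 0 is coded as the integer 0, every wolframin product term is 0 for those patients regardless of the other factor, so the choice of codes matters for how much information these terms carry.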
Table 6. Key Clinical-Genetic Findings in Wolfram Syndrome.
| Symptom | Prevalence (%) | Onset (yrs) | M (%) | Homozygotes (%) | Key Genetic Findings | Wolframin Class |
|---|---|---|---|---|---|---|
| Dysphagia | 60 | 23.1 ± 7.3 | 66.7 | 63 | Val142fsX110 | Type 0 (>70%) |
| Sialorrhea | 56 | 24.0 ± 7.4 | 60.0 | 68 | Val142fsX110 | Type 0 (68%) |
| Absence of Gag Reflex | 67 | 23.4 ± 6.4 | 53.3 | 63 | Val142fsX110, Trp371X | Type 0 (67%) |
| Dysmetria | 44 | 24.1 ± 5.7 | 60.0 | 75 | Val142fsX110 | Type 0 (80%) |
| Gait Instability | 64 | 26.0 ± 4.5 | 62.5 | 62 | Val142fsX110 & ex4 c.409_424dup16 | Type 0 (69%) |
| Ataxia | 53 | 26.0 ± 4.5 | 62.5 | 75 | Val142fsX110 | Type 0 (83%) |
| Cognitive Decline | 36 | 29.9 ± 12.2 | 56.2 | 81 | Val142fsX110 & Trp371X | Type 0 (81%) |
| Anosmia | 40 | 26.3 ± 6.2 | 55.6 | 78 | Val142fs251 & Val142fsX110 | Type 0 (78%) |
Table 7. Machine Learning Model Performance and Key Feature Importance for Neurological Outcomes in Wolfram Syndrome.
| Symptom | Accuracy (%) | Naïve Majority Voting (%) | Top Predictive Features (Feature Importance %) | Key Insights |
|---|---|---|---|---|
| Dysphagia | 88.9 | 60.0 | mut2_protein_class (30.3), mut1_protein_class (23.9), prod_wm12 (7.7) | Allele 2 mutations (esp. Val142fsX110) are strongest predictors; interaction of wolframin class with exon classification improves discrimination. |
| Sialorrhea | 91.1 | 55.6 | mut2_protein_class (26.3), mut1_protein_class (21.2), prod_mgm12 (9.2) | Nearly identical predictors as dysphagia, supporting shared pathophysiology involving swallowing and oral motor control. |
| Absence of Gag Reflex | 93.3 | 66.7 | mut2_protein_class (28.1), mut1_protein_class (23.1), prod_mgm1 (6.0), prod_wm2 (5.7) | Highest model accuracy; predicts airway protection risk with high sensitivity. |
| Dysmetria | 88.9 | 55.6 | mut2_protein_class (25.7), mut1_protein_class (22.3), prod_mgm2 (6.0), prod_wm1 (5.8) | Cerebellar dysfunction best explained by truncating mutations in exon 4 leading to wolframin loss. |
| Gait Instability | 91.1 | 64.4 | mut2_protein_class (25.1), mut1_protein_class (19.2), prod_wm2 (11.2), wolframin_class (4.1) | Interaction of wolframin class with allele-specific mutation data is critical for prediction. |
| Ataxia | 93.3 | 53.3 | mut2_protein_class (22.7), mut1_protein_class (14.1), wolframin_class (7.9), prod_wm12 (7.8) | Strong predictive value of wolframin class confirms direct cerebellar vulnerability to total protein loss. |
| Cognitive Decline | 88.9 | 64.4 | mut2_protein_class (26.6), mut1_protein_class (19.5), mut12_exon_class (6.9), prod_mgm2 (6.9) | Later-onset complication with strongest association to homozygosity and wolframin absence. |
| Anosmia | 91.1 | 60.0 | mut2_protein_class (27.4), mut1_protein_class (24.9), prod_mgm2 (6.2), mut2_exon_class (5.3) | Allele 2 mutations strongly predict anosmia, suggesting olfactory vulnerability to ER stress is allele specific. |
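The "Naïve Majority Voting" column in Table 7 is the accuracy of a classifier that always predicts the more frequent class, that is, max(k, n − k) / n for k affected patients out of n. The sketch below is a worked check of that arithmetic, assuming n = 45 and per-symptom counts back-calculated from the rounded prevalences in Table 6. These counts are an inference for illustration; the preprint does not report them directly.

```python
# Majority-class baseline: always predict the more common label.
N_PATIENTS = 45  # cohort size from Table 1

# ASSUMPTION: affected counts inferred from the rounded prevalences in Table 6
# (e.g. 60% of 45 -> 27); they are not stated in the preprint.
AFFECTED = {
    "Dysphagia": 27,
    "Sialorrhea": 25,
    "Absence of Gag Reflex": 30,
    "Dysmetria": 20,
    "Gait Instability": 29,
    "Ataxia": 24,
    "Cognitive Decline": 16,
    "Anosmia": 18,
}

# Model accuracies (%) as reported in Table 7.
MODEL_ACCURACY = {
    "Dysphagia": 88.9,
    "Sialorrhea": 91.1,
    "Absence of Gag Reflex": 93.3,
    "Dysmetria": 88.9,
    "Gait Instability": 91.1,
    "Ataxia": 93.3,
    "Cognitive Decline": 88.9,
    "Anosmia": 91.1,
}


def majority_baseline(affected: int, n: int = N_PATIENTS) -> float:
    """Accuracy (%) obtained by always predicting the majority class."""
    return round(100 * max(affected, n - affected) / n, 1)


def gain_over_baseline(symptom: str) -> float:
    """Percentage points by which the reported accuracy exceeds the baseline."""
    baseline = majority_baseline(AFFECTED[symptom])
    return round(MODEL_ACCURACY[symptom] - baseline, 1)


if __name__ == "__main__":
    for symptom, k in AFFECTED.items():
        print(f"{symptom}: baseline {majority_baseline(k)}%, "
              f"gain {gain_over_baseline(symptom)} points")
```

Under these assumed counts, the computed baselines agree with the values listed in Table 7. For symptoms present in fewer than half of the patients (dysmetria, cognitive decline, anosmia), the majority class is "absent", which is why their baseline exceeds the prevalence.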
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.