2. Computational Framework
This section presents the computational framework developed to model cognitive impairment progression and assess Alzheimer’s Disease (AD) risk in younger adults. Grounded in the Time Factor Hypothesis, the model is designed to capture the non-linear trajectory of biomarker changes leading to cognitive decline, potentially decades before clinical symptoms manifest. Figure 1 shows the conceptual pipeline: Biomarker Data → Simulation → Risk Zones → Correlative Risk Scoring → Risk Classification.
Axiom: The framework is based on gradual pathological changes in the brain’
The gradual pathological accumulations follow a non-linear progression — starting subtly, accelerating over time, and eventually plateauing. This biological behavior is best represented by a sigmoid function, which allows the model to simulate early-stage deviations in biomarkers before cognitive impairment becomes clinically apparent.
The model incorporates the following key biomarkers, each weighted based on its relative contribution to cognitive risk:
Cerebrospinal Fluid (CSF) Aβ42
Amyloid PET imaging
CSF Tau and phosphorylated Tau (p-Tau)
MRI FDG-PET (brain metabolism)
The computational framework is structured into three main components:
Descriptive Analysis – In this component, we established the expected physiological ranges for the cognitive risk associated with each biomarker. These baseline values were modeled using a sigmoid function to generate a synthetic dataset that captures each biomarker’s variability over time and across age groups.
Parameter Accumulation – This component tracks the progression and accumulation of biomarkers over time. By modeling these trajectories, we assessed deviations from normal levels, providing insights into the temporal dynamics of each biomarker in relation to AD risk.
Correlation and Classification – We analyzed the correlation between biomarker accumulation and neuronal changes associated with Alzheimer’s disease. This enabled us to classify cognitive risk into distinct categories — normal, mild risk, or high risk — based on biomarker fluctuations and their combined effect on Cognitive Impairment (CI) scoring.
2.1. Descriptive Analysis
Previous studies [16, 17, 18, 19, 20] have evaluated the expected average levels of key cerebrospinal fluid (CSF) biomarkers across different age groups and populations, including individuals living with HIV infection. Based on these findings, CSF Aβ42 levels below 480 pg/mL or above 800 pg/mL are considered clinically significant indicators of cognitive health status. Specifically, reduced Aβ42 levels suggest amyloid plaque accumulation, while elevated levels are typically associated with normal cognitive function.
For CSF Tau, age-specific thresholds have been proposed: levels should remain below 300 pg/mL for individuals aged 21 to 50, below 450 pg/mL for those aged 51 to 70, and under a critical threshold in individuals aged 70 to 90. Similarly, in Amyloid PET imaging, a Centiloid score of 0 is typical in younger adults, while scores approaching 100 are indicative of mild neurodegenerative changes.
Further research shows that individuals with CSF Aβ42 levels between 600 and 800 pg/mL generally maintain normal cognitive function, whereas levels falling below 480 pg/mL are linked to progressive cognitive decline [10]. Additionally, [11, 24] observed that Amyloid PET values less than 7 and CSF Tau levels below 7 are commonly found in cognitively normal individuals. In contrast, Amyloid PET values exceeding 7–10 correlate with amyloid positivity and an increased risk of Mild Cognitive Impairment (MCI) and Alzheimer’s disease. Tau levels above the 7–10 range are also associated with early neurodegenerative processes.
Cognitive impairment scores further contextualize these biomarkers, with scores below 3 associated with normal cognitive aging and scores above 6 indicating early-stage cognitive impairment [24]. Integrating these biomarker thresholds with cognitive impairment scoring provides a structured and quantifiable framework for classifying individuals into cognitive risk zones relevant to Alzheimer’s disease onset and related neurodegenerative disorders.
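The cutoffs quoted in this subsection can be written as simple lookup rules. The following is a minimal sketch, not the study's implementation: the function names and the returned labels are hypothetical, and any range for which the text gives no value is reported as unspecified rather than guessed.

```python
from typing import Optional


def classify_csf_abeta42(level_pg_ml: float) -> str:
    """Label a CSF Abeta42 level (pg/mL) using the cutoffs quoted in Section 2.1."""
    if level_pg_ml < 480:
        return "linked_to_decline"   # below 480 pg/mL
    if 600 <= level_pg_ml <= 800:
        return "normal_function"     # 600-800 pg/mL
    return "unspecified"             # assumption: other ranges get no label here


def csf_tau_limit(age: int) -> Optional[float]:
    """Return the age-specific CSF Tau upper limit (pg/mL), if the text states one."""
    if 21 <= age <= 50:
        return 300.0
    if 51 <= age <= 70:
        return 450.0
    return None  # no numeric limit is stated for other ages


def classify_ci_score(score: float) -> str:
    """Label a cognitive impairment score using the two stated cutoffs."""
    if score < 3:
        return "normal_aging"
    if score > 6:
        return "early_stage_impairment"
    return "intermediate"  # assumption: the text does not name the 3-6 band


# Illustrative values only, not taken from the dataset.
label = classify_csf_abeta42(450.0)
limit = csf_tau_limit(35)
```

Returning an explicit "unspecified" or `None` keeps the gaps in the published cutoffs visible instead of silently assigning a category.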
2.2. Sigmoid Simulation & Parameter Accumulation
To standardize the accumulation of biomarker values and imaging results relative to age, we modeled the biomarker measurements using a sigmoid function, defined in Equation 1. In this formulation, $L$ represents the maximum potential value of a given parameter, while $k$ serves as a scaling factor to adjust for variability in the input data. The term $x$ denotes the individual biomarker measurement, and $x_0$ represents the mean of the respective biomarker column, acting as a reference point for standardization. The sigmoid function is mathematically expressed as:
$$f(x) = \frac{L}{1 + e^{-k(x - x_0)}} \qquad (1)$$
This function effectively constrains the output between 0 and L, making it well-suited for classification tasks where the goal is to assess the likelihood of an individual belonging to a specific cognitive risk category. Within the context of this research, the sigmoid function enables the stratification of individuals into normal, mild-risk, and high-risk groups based on their biomarker profiles associated with Alzheimer’s Disease (AD).
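As an illustration, the sigmoid standardization can be sketched in a few lines of Python. This assumes the standard logistic form L / (1 + exp(-k (x - x0))), with x0 taken as the column mean as described above; the function names, the value of k, and the sample values are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(x: float, L: float, k: float, x0: float) -> float:
    """Logistic curve L / (1 + exp(-k * (x - x0))), bounded by 0 and L."""
    z = k * (x - x0)
    if z >= 0:
        return L / (1.0 + math.exp(-z))
    # Algebraically identical form that avoids overflow for very negative z.
    ez = math.exp(z)
    return L * ez / (1.0 + ez)


def standardize_column(values: Sequence[float], L: float = 1.0, k: float = 1.0) -> List[float]:
    """Map a biomarker column through the sigmoid, centred on the column mean."""
    x0 = sum(values) / len(values)
    return [sigmoid(v, L, k, x0) for v in values]


# Illustrative measurements only, not taken from the dataset.
scaled = standardize_column([400.0, 600.0, 800.0], L=1.0, k=0.01)
```

A measurement equal to the column mean maps to L/2, and values further from the mean approach 0 or L, which is the bounded behavior the paragraph above relies on.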
To further analyze biomarker progression, we computed the derivative of the sigmoid function, $f'(x)$, to identify critical points and ensure smooth curve behavior. By leveraging regression analysis alongside the derivative $f'(x)$, we reverse-engineered feature distributions, allowing for the controlled generation of synthetic instances representing individuals aged 10 years and older. This approach enriched the dataset, supporting the modeling of early biomarker changes potentially preceding clinical symptoms (see Table 1 for reference).
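For reference, the derivative can be worked out in closed form. The sketch below assumes the standard logistic form of the sigmoid, with $x$ the measurement and $x_0$ the column mean; these symbol names are an assumption made for illustration.

```latex
f(x) = \frac{L}{1 + e^{-k(x - x_0)}}
\quad\Longrightarrow\quad
f'(x) = \frac{L\,k\,e^{-k(x - x_0)}}{\left(1 + e^{-k(x - x_0)}\right)^{2}}
      = k\,f(x)\left(1 - \frac{f(x)}{L}\right)
```

For $k > 0$ the derivative is positive everywhere, so the curve is smooth and monotonic. It reaches its maximum, $f'(x_0) = kL/4$, at the inflection point $x = x_0$, where $f(x_0) = L/2$; this is the point at which the modeled accumulation is fastest.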
Henceforth, the term donor may be written as donor to reflect the enrichment and synthetic extension of the original dataset.
2.3. The Dataset
Most existing datasets in Alzheimer’s research predominantly comprise data from older individuals, typically aged 50 years and above. Data for younger individuals, particularly those aged 30–50, are scarce, and the available biomarker data are usually limited to one modality (either MRI imaging or numerical and categorical values, but rarely both). This study therefore adopts a hybrid dataset approach to enrich the dataset and broaden age representation.
The dataset construction involved aggregating data from Kaggle, ADNI, and OASIS, followed by rigorous cleaning and filtering to retain relevant features. When duplicate or similar entries were identified, data were grouped and averaged, with Age serving as the primary instance identifier. To address missing data, particularly for individuals under the age of 50, synthetic instances were generated using the sigmoid simulation described in Equation (1).
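The group-and-average step can be illustrated with a short, dependency-free sketch. It assumes each record is a dictionary with an "Age" key, that all records share the same numeric fields, and that no values are missing; the field names and sample values are hypothetical, and the actual cleaning pipeline may differ.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def average_by_age(records: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Group records by 'Age' and average every other numeric field."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["Age"]].append(rec)

    merged = []
    for age in sorted(grouped):
        rows = grouped[age]
        fields = [key for key in rows[0] if key != "Age"]
        averaged = {field: mean(row[field] for row in rows) for field in fields}
        merged.append({"Age": age, **averaged})
    return merged


# Illustrative records only, not taken from the source datasets.
records = [
    {"Age": 60, "CSF_Tau": 400.0},
    {"Age": 60, "CSF_Tau": 500.0},
    {"Age": 45, "CSF_Tau": 250.0},
]
merged = average_by_age(records)
```

After this step each age appears once, so Age can serve as the instance identifier; ages still absent from the merged table are the ones a synthetic generator would need to fill.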
Table 1 presents a snapshot of the resulting biomarker dataset designed for Alzheimer’s Disease (AD) risk assessment. Notably, the dataset starts from age 10, reflecting an intentional focus on early-stage biomarker progression rather than traditional cohorts limited to older populations.
Initial observations suggest that CSF Aβ42 levels increase with age during early development, potentially reflecting normal physiological changes before the expected decline associated with AD. Similarly, Amyloid PET and CSF Tau levels demonstrate gradual increases, indicating progressive biomarker changes that may begin well before clinical symptoms emerge.
The dataset includes the following key attributes:
Age: The individual’s age (beginning at 10 years).
CSF Aβ42: Cerebrospinal Fluid Amyloid Beta 42 levels, a biomarker indicating amyloid plaque accumulation, a hallmark of AD.
Amyloid PET: Positron Emission Tomography measurements of amyloid deposition in the brain, where higher values denote greater amyloid accumulation.
CSF Tau: Levels of tau protein in cerebrospinal fluid, serving as an indicator of neurodegeneration associated with AD.
MRI FDG-PET: A neuroimaging metric capturing structural and metabolic brain changes.
This enriched dataset enables the investigation of biomarker dynamics across a broader age range, offering valuable insights into early-stage Alzheimer’s risk assessment.
2.4. Correlation Analysis and Cognitive Risk Categories
It is critical to identify correlations between biomarkers and establish classification regions that stratify donors into normal (no_risk), mild_risk, and high_risk cognitive categories associated with AD. For instance, a reduction in CSF Aβ42 signals amyloid plaque accumulation in the brain, a hallmark of Alzheimer’s disease (AD) [17].
Furthermore, Figure 2 illustrates two key relationships in the dataset. Figure 2a presents the age-dependent trajectory of CSF Aβ42 levels, displaying a sigmoidal trend. CSF Aβ42 levels rise gradually during early life (ages 10–30), possibly reflecting normal amyloid metabolism. From midlife (ages 30–60), Aβ42 levels increase more rapidly, potentially indicating changes in amyloid clearance efficiency. Levels plateau in later years (60+), likely due to reduced clearance or plaque accumulation in brain tissues.
This trajectory aligns with established AD biomarker models, where CSF Aβ42 concentrations decline in individuals with amyloid pathology [5, 16, 21].
Figure 2b highlights the correlation between CSF Aβ42 and CSF Tau levels. The scatter plot and regression line reveal an inverse relationship: as CSF Aβ42 decreases, CSF Tau increases sharply. This supports the hypothesis that amyloid deposition (low Aβ42) is linked to neurodegeneration (elevated Tau), both of which are critical to AD progression.
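An inverse relationship of this kind can be quantified with the Pearson correlation coefficient. The sketch below is illustrative only: the function name is hypothetical, the sample values are invented to mimic the described trend, and the text does not state which correlation measure was used.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Invented values (pg/mL) that mimic the described trend; not study data.
csf_abeta42 = [800.0, 700.0, 600.0, 500.0, 400.0]
csf_tau = [200.0, 240.0, 300.0, 380.0, 480.0]
r = pearson_r(csf_abeta42, csf_tau)  # negative when Tau rises as Abeta42 falls
```

A coefficient close to -1 would indicate a strong inverse linear relationship; because the described increase in Tau is sharp rather than linear, a rank-based measure may also be worth checking.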
Applying the predefined medical thresholds from Section 2.1 provides clear cutoffs for biomarker levels, enabling the classification of cognitive states (Figure 3). This visualization traces biomarker trajectories across the lifespan, highlighting cognitive risk zones. The shaded backgrounds (green, yellow, orange, and red) indicate transitions from normal cognitive function to mild cognitive impairment (MCI). With increasing age, deviations in biomarker levels become more pronounced, particularly in individuals transitioning into high-risk or MCI categories.
These cognitive risk zones are defined as:
- (1) Normal: Biomarker levels within safe physiological ranges
- (2) Mild Risk: Slightly elevated biomarker levels indicating potential early changes
- (3) High Risk: Significant biomarker abnormalities but without formal clinical diagnosis
- (4) MCI (Mild Cognitive Impairment): Biomarker levels exceed critical thresholds indicating cognitive deterioration
As illustrated, CSF Aβ42 (purple) declines sharply with age, while Amyloid PET (red), CSF Tau (yellow), and MRI + FDG PET (blue) show progressive increases. The green cognitive impairment curve mirrors this upward trend, reinforcing the relationship between biomarker deviations and cognitive decline.
Figure 4 models the progression of cognitive impairment as a function of age, segmented into cognitive risk categories. Data points, color-coded by risk level, reveal a nonlinear increase in cognitive impairment over time. Initially, most individuals remain in the normal range (blue). As age advances, the probability of transitioning into mild-risk (orange), high-risk (green), and MCI (red) increases significantly.
The trajectory indicates a critical period around midlife (50–60 years), where cognitive risk accelerates sharply. This observation aligns with neurodegenerative models suggesting that biological and cognitive reserves initially buffer against decline until cumulative damage leads to rapid deterioration.
These findings emphasize the importance of early detection and monitoring. Individuals classified as mild-risk still represent a key intervention window where preventive strategies could delay or mitigate progression. Integrating machine learning techniques could further enhance this model by identifying subtle early indicators of cognitive impairment.
Considering Figure 3 and Figure 4, the central research question emerges:
What biomarker values can individuals aged 30–40 (and 40–50) maintain to remain within the safe (green) zone for a healthy, AD-free old age? Put another way, what thresholds signal progression toward mild or high risk of developing Alzheimer’s disease at an older age?