An Amyloid-Agnostic Reformulation of Alzheimer’s Disease: the Long Gene Vulnerability Hypothesis

Alzheimer’s disease (AD) is a genetically complex senile neurodegeneration with unknown etiology. The first gene discovered to be mutated in early-onset AD, the amyloid precursor protein (APP), has been widely assumed as a causal factor in the disease cascade due to its generation of Aβ species. APP has an evolutionarily conserved biological role and activates a signaling program with notable similarities to integrin—a cell adhesion receptor with a wide array of functions. Intriguingly, several AD genome-wide association study (GWAS) candidate genes, including the SHARPIN locus recently reported by us and others, influence signaling of the integrin pathway. Integrins are focal adhesion regulators and serve in nervous system development, synaptic plasticity, and Tau phosphorylation. These observations suggest that the function of APP probably goes beyond Aβ generation in AD. Aging—the strongest risk factor for AD—is associated with various clock-like events in cells. For instance, neurons are continuously impacted by stochastic ‘hits’ to their genomes in aging, in the forms of DNA damage, insertion-deletions, copy-number variations (CNVs) and other types of somatic mutations. DNA damage and somatic mutations can result in neoplastic changes and cancer in mitotically active cells. However, their consequences in post-mitotic cells such as aging neurons are less defined. The current hypothesis holds that the stochastic loss of DNA sequence data at random loci in aging affects longer genes by chance more frequently. As a result, the biological processes coordinated by long genes may be more vulnerable to such random aging effects. Curiously, as shown by us and others, long genes are strongly enriched for synapse- and cell adhesion-related ontologies, more than any other biological process or cellular compartment. In addition, among various cell types, neurons possess the highest levels of long gene expression and are therefore more vulnerable to such harmful effects. 
The long gene vulnerability hypothesis provides a simple link between aging and the genetic landscape of AD and warrants new strategies for disease modification.


Introduction
The Aβ fragment of the APP protein 1 has been the centerpiece of AD pathogenesis research and drug design following the amyloid cascade theory 2 . However, more than three decades after the successful cloning of the APP gene 3 , the biological function of its encoded protein remains speculative in the brain and elsewhere. Mounting evidence indicates that APP acts as a cell surface receptor and activates an intracellular signaling program 4,5 for synaptic function and plasticity 6 . Still, this essential biological function has received less attention in the field.
Sporadic late-onset AD is a genetically complex disease with a heritability of 60-80% 7 . GWAS and next-generation sequencing have identified multiple risk loci for late-onset AD in the last decade [8][9][10][11] . These loci provide a hypothesis-free glimpse of the underlying molecular pathways in AD and bring opportunities for revising disease models in a data-driven way. APP, presenilins (PSENs), and Tau variants have shown small contributions to the total heritability of late-onset AD 10,12 , whereas the APOE locus explains approximately a quarter of the disease heritability 13 . The GWAS loci have been suggested to highlight several pathways in AD pathogenesis, spanning microglial activation, lipid metabolism, focal adhesion, and Aβ turnover [14][15][16][17] . Nevertheless, the causal significance of Aβ in AD is a matter of ongoing debate [18][19][20] . The correlation between Aβ and brain atrophy seems to be weak, absent, or, in recent reports, paradoxically negative 21 .
It is argued here that the model of AD pathogenesis can be surprisingly simplified by reimagining Aβ deposition and Tau phosphorylation as potential consequences of the disease process rather than causal factors. Several testable predictions are proposed together with new disease modification strategies.

APP may be a synaptic adhesion molecule: the evolutionarily-conserved NPxY motif
The intracellular domain of APP attracts adaptor molecules with signaling activity [38][39][40] . Specifically, the APP intracellular domain contains an NPxY amino acid motif that has been highly conserved from roundworms to humans over more than 900 million years of evolution 41 . NPxY is a consensus motif for receptor sorting and intracellular signaling. For instance, integrins recruit their cytoplasmic adaptors (e.g., kindlin and talin) via a cytoplasmic NPxY motif, and this event ultimately affects the remodeling of the actin cytoskeleton 42 . Similarly, the NPxY-binding intracellular adaptors of APP converge on the same cytoskeletal actin pathway 4,23 (table 1). Notably, APOE lipoprotein receptors also use the same NPxY motif for signaling (as discussed in the next sections).
Several lines of evidence suggest that APP may be a synaptic protein. At the postsynapse, APP interacts with AIDA1, a synaptic plasticity regulator, via its NPxY site 43,44 . The APP family proteins form trans-synaptic adhesion dimers, stabilize synaptic connections 45 and coordinate neurotransmitter receptor function 46,47 . Taken together, the signaling function of APP 6 seems to overlap with that of integrin cell adhesion and its influence on synaptic plasticity 48,49 . This amyloid-independent role of APP dovetails with the body of GWAS evidence and the genetic architecture of AD.

γ-Secretase may be a synaptic adhesion modulator
Although γ-secretase dysfunction has been primarily researched in the context of Aβ generation, the function of this transmembrane protease is not limited to APP cleavage 83 . Several receptors such as Notch 84 , which is a novel familial AD candidate gene 85 , rely on γ-cleavage for normal signaling. Other γ-secretase substrates associated with AD include the APOE lipoprotein receptors (LRPs 86 ) and the ephrin synaptic adhesion receptors 87 , both of which regulate neurotransmission 86,88 and interact with the integrin adhesion complex 89,90 . Synaptic maturation is accompanied by an increased expression of γ-secretase at the postsynaptic membranes 91 , where this enzyme is anchored to various cell adhesion molecules [91][92][93] . Loss of γ-secretase activity disrupts membrane adhesion force generation 94 and causes erroneous axonal pathfinding 95 . Taken together, γ-secretase is essential to the signaling of multiple synaptic adhesion receptors other than APP, a physiological cleavage process that has hardly been explored in AD pathogenesis.

The genome-wide landscape of AD and synaptic adhesion
Pioneered by Lambert et al., multiple GWAS risk loci have been discovered for late-onset AD in the last decade 12,96 . Curiously, GWAS candidate genes seem to strongly converge to the integrin cell adhesion pathway, a mechano-chemical signaling event that transfers extracellular matrix (ECM) signals to the internal actin cytoskeleton and vice versa. Integrins are heterodimeric receptors generated from 18 α and eight β subunits in humans. These cell adhesion receptors coordinate bidirectional communications between the cell and the ECM, for instance, in synapse development and plasticity modulation 48 . Many of the GWAS loci code for proteins that interact with the β1 integrin pathway, such as the Src family kinases (SFKs), focal adhesion kinase (FAK), and actin reorganizers (table 2 and Fig. 1). Notably, the integrin pathway prevents Tau phosphorylation via integrin-linked kinase (ILK). In this context, Tau phosphorylation and ILK change the plasticity of cytoskeletal actin in neurites, an essential process for synaptic reshaping and outgrowth [97][98][99][100][101] .
The endocytosis machinery has been implicated by multiple risk loci of AD, including BIN1, PICALM 102 and ABCA7 103 . Endocytosis via clathrin 104 and IDOL-dependent pathways 105 regulates synaptic biology. Synaptic endocytosis modulates the strength of transmission by redistributing the pool of transmitter receptors from the postsynaptic density membrane (active state) to the intrasynaptic space (inactive state). For instance, LRPs are endocytic receptors of the APOE molecule and modulate postsynaptic glutamate receptor trafficking and plasticity 106,107 . Both LRPs and integrins recruit the clathrin-mediated endocytosis machinery via their NPxY motifs 108 . Intriguingly, kindlin-2, a known AD risk locus, links the LRP-mediated endocytosis to the integrin cell adhesion pathway 109 .

Long synaptic adhesion genes implicate DNA damage as the cause of AD
Cell adhesion sheds light on new mechanisms of AD pathogenesis. Curiously, among all pathways and cell compartments documented in the gene ontology (GO) database, cell adhesion- and synapse-related ontologies show the strongest enrichment of long genes 177 . In addition, gene expression data show that neurons express long genes more highly than any other cell type 178 . While the reason for this statistical overrepresentation is elusive, it may be speculated that long genes have increased the complexity of cell signaling pathways in evolution, such as those of brain development and synaptic connectivity. Long genes often possess long introns and more transcription factor binding sites. Also, long genes usually code for larger proteins with larger surface areas and more interaction sites. Elements of the neurodevelopmental program, such as axon guidance, neural migration and synapse formation, may rely on signaling complexities enabled by long gene products. Importantly, these large molecules also contribute to the postdevelopmental plasticity of the synapse 179 .
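The enrichment of long genes in a given ontology can be quantified with a standard one-sided hypergeometric test. The following is a minimal sketch only, and not necessarily the procedure used in the cited analysis 177 . The function name and all gene counts are hypothetical placeholders chosen for illustration.

```python
from math import comb


def long_gene_enrichment_p(total_genes, long_genes, term_genes, long_in_term):
    """One-sided hypergeometric p-value, P(X >= long_in_term).

    total_genes  : number of annotated genes in the background set
    long_genes   : number of background genes classified as "long"
    term_genes   : number of genes annotated to the ontology term
    long_in_term : number of long genes observed in that term
    """
    upper = min(long_genes, term_genes)
    denominator = comb(total_genes, term_genes)
    # Sum the probabilities of observing long_in_term or more long genes.
    tail = sum(
        comb(long_genes, k) * comb(total_genes - long_genes, term_genes - k)
        for k in range(long_in_term, upper + 1)
    )
    return tail / denominator


# Hypothetical counts (not real data): 20,000 background genes, 2,000 of
# them "long", and a 500-gene ontology term containing 120 long genes.
p_value = long_gene_enrichment_p(20_000, 2_000, 500, 120)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice, such a test would be repeated for every ontology term and corrected for multiple comparisons. The cutoff that defines a "long" gene is itself an analysis choice.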
Several independent groups have recently reported that somatic mutations and insertion-deletions (indels) accumulate in aging brain neurons at a more or less linear rate [180][181][182][183] . A long synaptic gene may be more vulnerable to such DNA damage events and somatic mutations (genosenium) that emerge in aging cells. Also, long neuronal genes often reside in chromosomal fragile sites and hot-spots of genome instability 184,185 , a feature that may render them more vulnerable to DNA damage in aging. Loss of long neuronal genes due to DNA damage accumulation may be more or less similar to the mutational loss of long tumor suppressor genes in cancers 186 , albeit with some distinctive features due to the post-mitotic state of neurons.
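The length argument can be illustrated with a simple Poisson model. If hits land uniformly at random and accumulate linearly with age, the expected number of hits in a gene is the product of its length, the per-base rate and the elapsed time. The sketch below assumes exactly that. The per-base rate and the three gene lengths are hypothetical values, not measurements, and the model deliberately ignores fragile sites, transcription-coupled effects and DNA repair.

```python
import math


def expected_hits(gene_length_bp, hits_per_bp_per_year, age_years):
    """Expected number of hits in one gene copy under a uniform, linear model."""
    return gene_length_bp * hits_per_bp_per_year * age_years


def prob_at_least_one_hit(gene_length_bp, hits_per_bp_per_year, age_years):
    """Poisson probability of one or more hits: 1 - exp(-expected hits)."""
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    return -math.expm1(-expected_hits(gene_length_bp, hits_per_bp_per_year, age_years))


ASSUMED_RATE = 1e-9  # hypothetical hits per base pair per year (illustration only)

for length_bp in (10_000, 100_000, 1_000_000):  # illustrative gene lengths
    p = prob_at_least_one_hit(length_bp, ASSUMED_RATE, age_years=80)
    print(f"{length_bp:>9} bp: P(at least one hit by age 80) = {p:.5f}")
```

While the expected number of hits per gene stays well below one, the probability of a hit scales almost linearly with gene length. Under this assumed rate, a megabase-scale gene is therefore roughly a hundred times more likely to be hit than a 10-kilobase gene in the same neuron.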
The biological pathways coordinated by long genes, namely synaptic function and cell adhesion, connect the genome-wide landscape of AD with APP, γ-secretase and APOE. Aβ generation and Tau phosphorylation may be downstream consequences of this causal mechanism (see the next sections).

Testing the hypothesis: LRP1b, DAB1 and CSMD1 under the spotlight
The medial temporal lobe neurons express certain synaptic genes such as the NMDA receptor subunits for regulating plasticity and memory formation. NMDA receptors are coupled with synaptic adhesion molecules and cytoskeletal actin (Fig. 1). Due to such proteomic diversity in different brain regions and cells, some neurons may be more vulnerable to aging and DNA damage, for example if they incorporate multiple genes that are mutationally fragile in aging. While an extensive and exploratory search of long and fragile genes in AD may be helpful, three genes have interesting features warranting focused research (table 3):
• LRP1b codes for a receptor of the APOE molecule. LRP1b is the ninth-longest gene in the human genome and is selectively expressed in the hippocampal formation 187,188 (Fig. 2). This giant receptor maps to the chromosomal fragile site FRA2F and is among the top-ten genes frequently deleted in cancers 189 . Considering its interaction with the postsynaptic density protein PSD95 190 , the synaptic plasticity regulator PICK1 191 , and the APP protein 192 , LRP1b may have postsynaptic roles. The biological functions of its closest homolog, LRP1 (with 59% sequence similarity), may help in speculating on the potential synaptic roles of LRP1b. LRP1 regulates postsynaptic glutamate receptor trafficking, long-term potentiation 193 and integrin signaling 194 . Both of these receptors have two NPxY motifs.
• DAB1 is a mandatory signaling adaptor of the APOE/RELN signaling axis, an essential biological pathway in the perforant synaptic path of the medial temporal lobes 195,196 . DAB1 is coded by the 13th-longest gene in the human genome and maps to the chromosomal fragile site FRA1B.
• CSMD1 is another long synaptic gene with tumor-suppressor-like fragility 197 . This gene, which is the sixth-longest gene in the human genome, resides at the chromosomal fragile site FRA8B and prevents activation of the complement system 198 . As a giant synaptic membrane adhesion molecule, CSMD1 is strongly expressed in the hippocampal formation 188 . These features warrant research into the potential loss of CSMD1 in the aging brain and its possible influence on complement activation, synaptic pruning 199 and integrin signaling 200 . Notably, the C3b-4b complement complex, a cognate ligand for the AD risk locus CR1 receptor, is degraded by CSMD1 198 .

DNA damage has been suspected as a mechanism of neurodegeneration and AD for some time 18,[204][205][206] . Single-cell sequencing has recently revealed an accumulation of somatic mutations in human brain neurons [181][182][183]207,208 , a process termed genosenium. A number of preliminary works have surveyed somatic mutations in AD and non-demented brain neurons with inconsistent results 183 . It is noteworthy that survivorship bias probably confounds single-cell mutational readouts, since different subtypes of neurons show variable degrees of vulnerability to AD. As much as 90% of vulnerable neurons may be lost in severe AD 209 . In support of this notion, healthy brains seem to lose a substantial proportion of neurons with higher mutational loads in aging 210 . Compared to non-demented brains, AD brains show a reduced number of somatic mutations. While inconclusive, this observation may suggest that neurons with higher mutational loads are generally more vulnerable in the aging brain and are more easily depleted in AD. In addition to somatic single-nucleotide variants (sSNVs), further studies are needed to quantify copy number variations (CNVs) and indels in AD neurons, since these less-explored types of somatic mutations frequently impair long genes at fragile sites, some of which have neuronal roles 211 .
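The survivorship argument can be illustrated with a toy simulation. The sketch below assumes that each neuron carries a normally distributed mutational load and that its chance of surviving falls along a logistic curve as the load approaches a tolerance threshold. All means, spreads, thresholds and function names are hypothetical and are not fitted to any dataset.

```python
import math
import random


def survival_probability(load, threshold, scale=50.0):
    """Logistic survival curve: neurons far above `threshold` rarely survive."""
    return 1.0 / (1.0 + math.exp((load - threshold) / scale))


def simulate_survivors(mean_load, sd_load, threshold, n_neurons=20_000, seed=0):
    """Return (all loads, loads of surviving neurons) for one simulated brain."""
    rng = random.Random(seed)
    loads = [max(0.0, rng.gauss(mean_load, sd_load)) for _ in range(n_neurons)]
    survivors = [x for x in loads if rng.random() < survival_probability(x, threshold)]
    return loads, survivors


def mean(values):
    return sum(values) / len(values)


# Hypothetical scenario: the "AD" brain accumulates MORE mutations per neuron
# (higher mean load) but tolerates fewer of them (lower threshold).
control_all, control_alive = simulate_survivors(1000, 200, threshold=1400)
ad_all, ad_alive = simulate_survivors(1100, 200, threshold=1000)

print(f"control: true mean {mean(control_all):.0f}, observed mean {mean(control_alive):.0f}")
print(f"AD:      true mean {mean(ad_all):.0f}, observed mean {mean(ad_alive):.0f}")
```

Under these assumptions, the surviving neurons of the simulated AD brain show a lower average load than those of the control, even though more mutations were accumulated before cell loss. Sequencing only surviving neurons may therefore understate the mutational burden in the more vulnerable population.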
Considering the post-mitotic state of neurons, an interesting question is whether DNA strand break and repair cycles in neurons affect fragile site genes similar to the effect of cell division cycles in cancer pathogenesis 212 .
The current hypothesis brings new elements to the DNA damage theory of aging. Long genes are postulated to be more susceptible to DNA damage and its consequences, such as somatic CNVs and SNVs. This phenomenon is predicted to disable long genes and affect essential synaptic processes, such as the postsynaptic adhesion complex and fragile site genes (table 3).

Long gene vulnerability and the amyloid cascade theory
The APP molecule is probably one member of a large synaptic adhesion interactome, rather than a central disease factor (Fig. 1). Some members of this interactome may be vulnerable to DNA damage in aging, causing others to appear as disease risk loci. For example, mutational loss of the LRP1b gene, whose protein product binds APP and affects its cleavage 192 , may increase Aβ generation as an indirect effect of DNA damage. As noted for its closest homolog LRP1 97-100 , another potential consequence of this event is β1 integrin dysfunction, with Tau phosphorylation taking place in this cascade 98 . Taken together, neuropathology and the proteinopathy in AD may represent consequences of altered signaling events, rather than causal factors. Following this assumption, the current hypothesis is incompatible with the amyloid cascade theory.

Glial cells, innate and adaptive immunity and the complement system
As a part of the innate immune system, the complement cascade controls synaptic pruning in the developing brain and in psychiatric disorders by tagging unwanted synapses for removal 214 . Microglial cells recognize activated complement proteins deposited on the synapse via a ligand-receptor interaction 215 . The genetic architecture of AD seems to implicate some degree of overlap between glial-specific genes and neurodegeneration. In support of a potential role of the complement system in AD, the extremely long and fragile synaptic gene CSMD1 prevents complement activation (see above). Nevertheless, it remains unknown whether neuroinflammation is a cause or a consequence of the disease pathogenesis mechanisms. Notably, somatic mutations in cancer cells result in the generation of novel peptides (neoantigens) that are unknown to the immune system and elicit an immune response 216 . Whether somatic mutations in synaptic genes may cause immune activation remains an open and interesting question.

Conclusion
Aging is associated with an accumulation of random 'hits' to the DNA base sequence in the form of DNA damage, CNVs, SNVs and other types of somatic mutations. This process can result in carcinogenesis in mitotically active cells, but its effects have yet to be understood in post-mitotic neurons. Long synaptic genes may be more vulnerable to this random process and form a bottleneck in healthy brain aging, since they contain more 'information' (lower entropy) that is more likely to be lost over time. In addition, long genes often map to chromosomal fragile sites and mutational hotspots. Compared to healthy individuals, the pace of mutational accumulation may be higher in AD patients, and/or the resistance threshold of neurons to such harmful effects of aging may be lower, causing earlier cell death or dysfunction. Long gene vulnerability warrants new disease modification strategies for the treatment of AD.