Analyzing the Distribution of Mutations for Glycogen Storage Disease Type 1a in Turkey and Suggested Gene Therapy Methods for Its Treatment

Glycogen storage disease (GSD) is a rare disease caused by defects in glycogen metabolism. GSD Type 1a is the most common subtype of GSD Type 1, accounting for approximately 80% of cases, and is caused by mutations in the Glucose-6-Phosphatase Catalytic Subunit (G6PC) gene on human chromosome 17q21. The G6PC gene encodes the glucose-6-phosphatase (G6Pase) protein, which cleaves glucose-6-phosphate into glucose and inorganic phosphate (Pi); GSD Type 1a patients fail to break down glucose-6-phosphate because of mutations in this gene. In this study, we aim to propose new therapeutic approaches for GSD Type 1a. We collected mutation data of 57 GSD Type 1a patients from Turkey. According to the data, 16 types of mutations were observed in the G6PC gene. The allele frequencies of these mutations were calculated as 59% for R83C/H, 11% for W160*, 7% for G270V, and 28% for the remaining, less frequent mutations. The tertiary structure of G6Pase has not yet been determined experimentally. To understand the possible impacts of these mutations, we obtained computational predictions of the tertiary structure of G6Pase using 5 different tools. Finally, we suggest two promising gene therapy methods for GSD Type 1a, prime editing for R83C/H mutations and mRNA delivery for the other mutations, and we additionally suggest a commercially available drug, originally developed for another disease, for patients with W160*, W86*, and S15* mutations.


INTRODUCTION
Glycogen storage disease (GSD) is a rare, inherited disease caused by enzyme deficiencies in glycogen metabolism. GSD mainly causes problems in the liver, skeletal muscles, heart, kidneys, and central nervous system. Different types of GSD are identified according to the deficient enzyme, which takes part in the synthesis or degradation of glycogen. One of the major subtypes of GSD is GSD Type 1, which is caused by a deficiency of glucose-6-phosphatase (G6Pase), an enzyme that regulates glucose homeostasis and is essential for the breakdown of glucose-6-phosphate (G6P) into glucose and inorganic phosphate (Pi); its enzymatic activity is also essential for the transport of G6P into the endoplasmic reticulum lumen. GSD Type 1 is further classified into 3 main subcategories: GSD Type 1a, GSD Type 1b, and GSD Type 1c. A defect in glucose-6-phosphatase (G6PC) (human chromosome 17q21) causes GSD Type 1a, a defect in glucose-6-phosphate translocase (G6PT) (human chromosome 11q23) leads to GSD Type 1b, and a defect in phosphate translocase (human chromosome 11q23-24.2) causes GSD Type 1c. GSD Type 1 occurs in approximately 1 in 100,000 births, and GSD Type 1a accounts for about 80% of GSD Type 1 patients.
The main glycogen stores in the body are the liver and skeletal muscles. Blocked glycogenolysis and gluconeogenesis due to liver problems lead to symptoms such as hypoglycemia with elevated lactic acid, triglyceride, and uric acid levels after birth, caused by excess G6P. Other symptoms include muscle cramps, muscle weakness, fatigue, and exercise intolerance due to glycogen accumulation in skeletal muscle, and a distended abdomen due to hepatomegaly. In addition, tremor, increased ventilation rate, sweating, and pale skin are usually seen during sleep because of fasting.
In the past, most GSD Type 1 patients did not survive until adolescence; they usually died during childhood. Over time, thanks to strict and frequent dietary treatment with uncooked cornstarch, given throughout the day and night, GSD Type 1 patients have been able to avoid hypoglycemia and elevated lactic acid levels. Although their life expectancy has increased, hepatic, renal, and intestinal problems still emerge as they grow up. Consequently, hepatic tumors or chronic renal disease cause the death of GSD Type 1 patients in their third decade. These tumors are mostly diagnosed as hepatocellular adenomas, which can undergo malignant transformation, and a clear majority of patients suffer from albuminuria and proteinuria, eventually requiring dialysis or renal transplantation.
The human G6Pase gene (G6PC) is localized at chromosome 17q21, spans 12.5 kb, has 5 exons, and codes for a hydrophobic glucose-6-phosphatase protein of 357 amino acids. The protein forms 9 transmembrane helices anchored in the endoplasmic reticulum membrane, with the amino (N) terminus in the lumen and the carboxyl (C) terminus in the cytoplasm. Ghosh et al. predicted that the amino acids in the catalytic site of the glucose-6-phosphatase enzyme (G6Pase) include Lys76 (K76), Arg83 (R83), His119 (H119), Arg170 (R170), and His176 (H176). During catalysis, the His176 residue of G6Pase is phosphorylated and forms an enzyme-phosphate intermediate. Since mutations of these residues were shown to abolish G6Pase activity, the residues are considered essential for phosphatase action. More than 200 different variants have been identified in the G6PC gene in patients affected by GSD worldwide.
Prime editing is a precise, highly efficient genome editing method that can perform targeted insertions, deletions, and all point mutations without making double-strand breaks, requiring donor DNA templates, or producing many byproducts (Anzalone et al., 2019). Programmable nucleases such as CRISPR-Cas9 create double-strand DNA breaks (DSBs), which can generate mixtures of insertions and deletions at target sites. Unfortunately, DSBs can lead to unwanted outcomes, including excessive byproducts, translocations, and p53 activation. Prime editing instead uses prime editors (PE), which consist of a reverse transcriptase (RT) fused to an RNA-programmable nickase (Cas9n, the H840A Cas9 nickase) and a prime editing guide RNA (pegRNA) that recognizes and specifies the target DNA site and carries the new genetic information to be written into that site (Anzalone et al., 2019). Anzalone et al. used prime editing for the in vitro correction of the primary genetic causes of sickle cell disease (which requires a transversion in HBB) and Tay-Sachs disease (which requires a deletion in HEXA), efficiently and with few byproducts (Anzalone et al., 2019).
Another promising therapeutic method is mRNA delivery, which has a high potential to treat diseases. So far, mRNA delivery has been used in clinical trials of vaccines against viruses such as HIV, of cancer immunotherapy for multiple cancers, and of protein replacement therapies for cardiovascular diseases and hemophilia B (DeRosa et al., 2016). Briefly, in vitro-transcribed (IVT) mRNA is packaged as cargo into a delivery vehicle and delivered into the target cell by one of several methods. Once inside the cell, the IVT mRNA is used by the protein synthesis machinery of the transfected cell to express the desired protein.
In this study, we aim to describe the common and novel mutations that lead to GSD Type 1a among 57 patients from Turkey and to propose novel gene therapy approaches, including CRISPR, prime editing, and mRNA delivery, to restore functional glucose-6-phosphatase levels in patients.

Data Collection and Analysis
To establish a treatment process, it is important to fully understand the mutations in the G6PC gene that cause glucose-6-phosphatase enzyme deficiency. Therefore, the mutations in the G6PC genes of 57 GSD Type 1a patients from Turkey and their incidences were analyzed by the Intergen Genetic and Rare Diseases Research and Diagnosis Centre in Turkey.

Tertiary Protein Structure Prediction
To interpret the changes caused by the R83C mutation, we sought the enzyme's tertiary structure, but it was not found in databases such as the Protein Data Bank. Therefore, different tools were used to predict the tertiary structure of glucose-6-phosphatase. The FASTA sequence obtained from the UniProt database (The UniProt Consortium, 2019) was used as input. The sequence was submitted to the I-TASSER, Phyre-2, ExPASy, CPHmodels (Nielsen et al., 2010), and ROBETTA three-dimensional modeling applications, and 3D structures in PDB format were obtained. Since each of these applications predicted more than one model, the most appropriate ones among them were selected. When the parameters of the modeling tools were investigated, it was found that each tool has its own parameter for reporting the accuracy of its predicted models. Although these parameters do not express accuracy as a percentage, they provided insight into how closely the models may reflect the real tertiary structure of glucose-6-phosphatase.
The parameters and their values obtained from the tools are as follows:
- PHYRE-2: Mutant → Confidence 100%, ID 20%; Wildtype → Confidence 100%, ID 20%
- I-TASSER: Mutant → C-Score = 0.46; Wildtype → C-Score = 0.46
- ROBETTA: Mutant → Confidence = 0.23; Wildtype → Confidence = 0.24
- CPHmodels: no parameters were provided, only the model in PDB format
- ExPASy: Mutant → GMQE = 0.31, QSQE = not obtained, Identity = 17.96%; Wildtype → GMQE = 0.32, QSQE = not obtained, Identity = 18.45%
For PHYRE-2, high confidence and identity values indicate high accuracy of the predicted protein structure. In the obtained result, the confidence value was high, which indicates that the model is very likely based on a true homologue, but the identity value of 20% indicates low sequence identity between the template and the 357-amino-acid wildtype and variant glucose-6-phosphatase sequences. For I-TASSER, the C-Score, which is a confidence value, is reported together with the TM-Score and RMSD. The TM-Score measures the structural similarity between two protein structures. According to the TM-Score, a model with a value greater than 0.5 has a correct topology, while a value less than 0.17 indicates random similarity. The RMSD expresses the average deviation between the positions of corresponding residues. The C-Score, which is correlated with the TM-Score and RMSD, was understood to indicate a native-like protein structure in the range 0.75-0.91. Among the 5 different wildtype and variant G6PC tertiary structures predicted by I-TASSER, the most appropriate model was the second one, which had the best C-Score. For Robetta, a confidence value of 0.0 corresponds to a mispredicted tertiary structure, while a value of 1.0 corresponds to a well-predicted tertiary structure. CPHmodels did not provide any parameters; it only returned the tertiary structure in PDB format, which was displayed in PyMOL.
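The TM-Score thresholds quoted above can be written as a small helper. This is only an illustrative sketch: the function name and the label "intermediate" for values between the two thresholds are our assumptions and are not taken from I-TASSER.

```python
def interpret_tm_score(tm_score: float) -> str:
    """Classify a TM-Score with the two thresholds quoted in the text."""
    if not 0.0 <= tm_score <= 1.0:
        raise ValueError("TM-Score is expected to lie between 0 and 1")
    if tm_score > 0.5:
        return "correct topology"
    if tm_score < 0.17:
        return "random similarity"
    # Hypothetical label for values between the two quoted thresholds.
    return "intermediate"
```

Such a helper only restates the thresholds; it does not replace the per-tool confidence values listed above.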
ExPASy performs tertiary structure prediction by homology modeling and reports three parameters. The GMQE (Global Model Quality Estimate) value estimates the overall quality of the model from the target-template alignment and the template structure. The QSQE (Quaternary Structure Quality Estimate) value estimates the expected accuracy of the predicted quaternary structure. Both values should be close to 1 for a good prediction, and a high sequence identity (ID) value also indicates a good tertiary structure prediction.
After the modeling process, the R83C change was applied to the wildtype glucose-6-phosphatase enzyme model using PyMOL (The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, n.d.). In addition, the R83C mutation was introduced into the sequence obtained from UniProt (The UniProt Consortium, 2019), and the variant sequence was modeled with the same 3D modeling tools. The resulting models were aligned in PyMOL (The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, n.d.). In addition to the alignment between wildtype and variant models, alignments were carried out between the models obtained from the different tools to understand the differences between the 3D models of the G6Pase enzyme. After displaying the mutations on the tertiary structure of G6Pase, gene therapy methods to correct these mutations were investigated. For the correction of R83C/H mutations, prime editing looks promising. To determine the sequences required for the prime editing system, the NY Genome Prime Editing Tool (NY Genome Prime Editing Tool, n.d.) was used.
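Introducing a substitution such as R83C into a protein sequence before submitting it to the modeling tools can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function name is hypothetical, and the toy sequence is a placeholder with an arginine at position 83, not the real G6Pase sequence from UniProt.

```python
import re


def apply_substitution(sequence: str, variant: str) -> str:
    """Apply a one-letter substitution such as 'R83C' to a protein sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", variant)
    if match is None:
        raise ValueError(f"unsupported variant format: {variant}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if pos < 1 or pos > len(sequence):
        raise ValueError("position lies outside the sequence")
    # Check that the wildtype residue matches before editing (1-based position).
    if sequence[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + alt + sequence[pos:]


# Toy 90-residue placeholder (NOT the real G6Pase sequence): R at position 83.
toy_wildtype = "A" * 82 + "R" + "A" * 7
toy_mutant = apply_substitution(toy_wildtype, "R83C")
```

Checking the reference residue first guards against off-by-one errors when the numbering of the input sequence differs from the numbering used in the variant name.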

Prime Editing Mechanism
Prime editors (PE) consist of a reverse transcriptase (RT) coupled to a Cas9 nickase and are guided by a prime editing guide RNA (pegRNA), which recognizes and specifies the target DNA region and also encodes the new information to be written into this region (Anzalone et al., 2019). During prime editing, the PE-pegRNA complex binds the target DNA region and nicks the PAM sequence-containing strand (Anzalone et al., 2019). After nick formation, the resulting 3' end hybridizes to the primer binding site (PBS) of the pegRNA, and the RT then reverse-transcribes a new DNA segment containing the desired edit, using the RT template of the pegRNA (Anzalone et al., 2019). An equilibration occurs between the edited and the unedited flaps; after intracellular flap cleavage and ligation, DNA repair yields DNA carrying the desired change (Anzalone et al., 2019). Prime editor 2 (PE2) uses an engineered RT for more efficient editing, while prime editor 3 (PE3) additionally nicks the non-edited DNA strand to promote its replacement and enhance editing efficiency (Anzalone et al., 2019).
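The relationship between the nicked PAM strand and the 3' extension of the pegRNA (RT template followed by PBS) can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names, the toy sequence, and the PBS and RT template lengths are hypothetical, the sequences are written as DNA for readability, and actual designs for G6PC should come from a dedicated design tool such as the one used in this study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def pegrna_extension(edited_pam_strand: str, nick_index: int,
                     pbs_length: int, rt_template_length: int):
    """Derive the RT template and PBS (both 5'->3') of a pegRNA 3' extension.

    edited_pam_strand: PAM-containing strand, 5'->3', already carrying the
        desired edit downstream (3') of the nick.
    nick_index: number of bases that lie 5' of the nick on that strand.
    """
    start = nick_index - pbs_length
    end = nick_index + rt_template_length
    if start < 0 or end > len(edited_pam_strand):
        raise ValueError("requested PBS/RT template exceeds the sequence")
    # The PBS pairs with the bases immediately 5' of the nick.
    pbs = reverse_complement(edited_pam_strand[start:nick_index])
    # The RT template encodes the edited sequence 3' of the nick.
    rt_template = reverse_complement(edited_pam_strand[nick_index:end])
    return rt_template, pbs, rt_template + pbs


# Toy example with a made-up sequence (not taken from G6PC).
toy_rt_template, toy_pbs, toy_extension = pegrna_extension(
    "GATTACAGGCTTAC", nick_index=6, pbs_length=4, rt_template_length=5)
```

In this toy case the extension is simply the reverse complement of the window that spans the PBS region and the edited region, which is why the nicked 3' end can prime synthesis of the edited flap.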

Mutations from Turkey
In this study, patients affected with GSD Type 1a in Turkey were analyzed, and among 57 patients (corresponding to 114 alleles), 16 different mutations were identified on different exons and introns, as in Figure 1. The rates of the most common mutations were observed as follows:
- R83C is the most common variant, found on 54% of the alleles
- W160* is seen on 11% of the alleles
- G270V is identified on 7% of the alleles
- R83H is found on 5% of the alleles
According to the mutation data from Intergen, the following changes were encountered: S15*, R83C, R83H, W86*, M121V, M121I, W160*, c.498_516del, c.547dupG, c.563-21G>C, G270V, c.592-593delAT (p.I198Pfs*5), V321D, c.1004delT, V338F, R295C. The distribution of the mutations across introns and exons can be observed in Figure 1a. When the incidence of these mutations was investigated, as in Figure 1b, it was observed that 32 of 57 patients have c.247C>T (R83C), 7 of 57 have c.480G>A (W160*), 4 of 57 have c.809G>T (G270V), 3 of 57 have c.592-593delAT (p.I198Pfs*5), 3 of 57 have c.248G>A (R83H), and 2 of 57 have the c.962T>A (p.V321D) mutation. The remaining mutations were each found in only 1 of the 57 patients. Some patients were diagnosed with multiple mutations; for example, both M121I and W160* were found in one patient. Thus, Figure 1 was prepared according to the mutations found on alleles instead of patients. Among the most frequent mutations, c.247C>T (p.R83C) is seen on 54% of the alleles, c.480G>A (p.W160*) on 11%, and c.809G>T (p.G270V) on 7%, as in Figure 2.
Figure 2. Percentage of allele incidences. R83C/H is found on 59% of the alleles, while W160* is found on 11%. G270V is found in 7%. Other mutations can be seen on 28% of the alleles.
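Counting mutations per allele rather than per patient, as done for Figures 1 and 2, can be sketched as below. This is a minimal illustration that assumes each patient contributes exactly two alleles; the function name is hypothetical and the four genotypes are invented placeholders, not patient data from this study.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return the frequency of each allele among all alleles.

    genotypes: list of (allele_1, allele_2) tuples, one tuple per patient.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


# Invented placeholder genotypes (NOT data from the 57 patients).
toy_genotypes = [
    ("R83C", "R83C"),   # homozygous patient contributes two R83C alleles
    ("R83C", "W160*"),  # compound heterozygous patient
    ("G270V", "R83H"),
    ("R83C", "R83C"),
]
toy_frequencies = allele_frequencies(toy_genotypes)
```

With this convention a homozygous patient is counted twice for the same mutation, which is why allele-based percentages differ from the share of patients carrying a mutation.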
Among the mutations of patients from Turkey, two were found to be in the catalytic site of the enzyme: R83C and R83H. These mutations were observed in 59% of the alleles of the patients. Because they are located in the catalytic site and account for 59% of the alleles, R83C and R83H look like promising targets for prime editing. Enzyme deficiency caused by the other mutations may be addressed by mRNA delivery, since the incidences of these mutations are too low to justify designing specific gRNAs for prime editing for each of them.
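The link between the coding DNA positions and the protein positions reported above can be checked with simple arithmetic. The sketch below assumes that c. numbering starts at the A of the ATG start codon and applies only to exonic coding positions; the function name is ours.

```python
def codon_of_cdna_position(cdna_position: int):
    """Map a coding DNA position (c. numbering) to (codon number, base in codon)."""
    if cdna_position < 1:
        raise ValueError("coding positions start at 1")
    codon_number = (cdna_position - 1) // 3 + 1
    base_in_codon = (cdna_position - 1) % 3 + 1  # 1, 2 or 3
    return codon_number, base_in_codon
```

For example, c.247C>T and c.248G>A fall on the first and second base of codon 83, respectively, which is consistent with both variants altering the same arginine residue (R83C and R83H).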

Tertiary Structure
Since the tertiary structure of the G6Pase enzyme has not been determined yet, possible tertiary structures were predicted with different tools, as in Figure 3. Although none of these applications gave significantly good values for the predicted tertiary structures of the glucose-6-phosphatase enzyme, PHYRE-2 and ExPASy produced relatively the worst predictions, while I-TASSER predicted the best models. CPHmodels can be counted among the tools that gave "acceptable" predictions, since it used more amino acids than the other tools, although it used only 198 amino acids of the enzyme sequence rather than the complete sequence of 357 amino acids. In addition, when the models created by Robetta, I-TASSER, and CPHmodels were aligned in PyMOL, changes in the protein structure with the R83C mutation were observed. The mutation did not produce a significant difference between the wildtype and the mutant models; it caused only a minor change in the tertiary structure. In the alignment between the Robetta, I-TASSER, and CPHmodels models, it was found that these tools predicted different three-dimensional structures.
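One common way to quantify the difference between two aligned models is the root-mean-square deviation (RMSD) over matched atoms. The sketch below is illustrative only: it assumes that the two coordinate lists are already superimposed and contain the same atoms in the same order (for example, C-alpha atoms), whereas PyMOL performs the superposition itself. The function name is ours and no coordinates from the study are used.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))
```

A small RMSD between wildtype and variant models from the same tool, together with larger values between tools, would correspond to the qualitative observations described above.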

Prime Editing and mRNA Delivery
After tertiary structure prediction, the prime editing sequences required to edit the R83C mutation were determined with the NY Genome Prime Editing Tool, as in Tables 1 and 2 (NY Genome Prime Editing Tool, n.d.). The proposed mechanism for editing R83C/H mutations of G6PC by prime editing is shown in detail in Figure 4, and the mechanism is explained in the methods section. For the rest of the mutations, G6PC mRNA delivery for the treatment of GSD Type 1a is shown in Figure 5.
Figure 4. (a) Cas9n nicks the PAM strand and the PE system performs reverse transcription on the pegRNA template. Next, cellular systems cleave the DNA flap and the edited DNA is incorporated into the target DNA region. (b) Exon 2 of variant G6PC at the top, prime editing elements in the middle, and the prime editing mechanism for correction of the R83C variant beneath them. After hybridization between the pegRNA and the target DNA region, the pegRNA specifies the PAM DNA strand to be nicked by the Cas9 (H840A) nickase. The primer binding site (PBS) and the PAM strand hybridize, and the reverse transcriptase (RT) writes a new flap. After reverse transcription, two outcomes are possible: (i) flap equilibration leads to hybridization of the new DNA flap to the unchanged DNA strand and, thanks to the DNA repair mechanism, the targeted change is made on the target DNA strand; or (ii) flap equilibration leads to degradation of the new DNA flap and the target DNA site stays unmodified.
Figure 5. mRNA delivery method. Glucose-6-phosphatase mRNA is modified to increase its stability inside the cells and packaged into a delivery vehicle. The delivery vehicle is administered to the patient by intravenous injection. The injected vehicle travels to the liver, is taken up into the liver cell within an endosome, and is degraded by intracellular degradation pathways. The released G6Pase mRNA is translated by ribosomes, and the translated protein goes to the endoplasmic reticulum (ER) for post-translational modifications and is then embedded into the ER membrane, where it becomes fully functional.
DISCUSSION
GSD Type 1 is a rare genetic disease caused by a deficiency of the glucose-6-phosphatase (G6Pase) enzyme, which regulates glucose metabolism. Although there is currently no cure for GSD Type 1, new therapeutic methods are being developed for various diseases every day. In this study, R83C/H mutations were found to be common among 57 patients from Turkey and to lie in the catalytic site of the enzyme. To correct these mutations in the G6PC gene and restore glucose-6-phosphatase, prime editing (PE) appears promising and may be more advantageous than the CRISPR/Cas9 method (Anzalone et al., 2019). CRISPR/Cas9 edits rely on non-homologous end joining (NHEJ) or homology-directed repair (HDR) to fix DNA breaks, while the prime editing system employs DNA mismatch repair. This is an important feature of the technology, given that DNA repair mechanisms such as NHEJ and HDR generate unwanted, random insertion or deletion (indel) byproducts, which complicate the retrieval of cells carrying the correct edit (Anzalone et al., 2019). Also, prime editors allow all types of substitutions, both transitions and transversions, to be introduced into the target sequence, and prime editing offers fewer off-target effects than CRISPR/Cas9 (Anzalone et al., 2019).
In addition, prime editing allows all 12 possible base-to-base conversions, and it can be used to perform insertions of up to 44 bp and deletions of up to 80 bp (Anzalone et al., 2019). Prime editing was as efficient as or more efficient than homology-directed repair in some human cell lines, and different PE systems (PE2, PE3, and PE3b) were studied on mouse models (Anzalone et al., 2019). For our study, p.R83C (c.247C>T) or p.R83H (c.248G>A) can be expected to be corrected with fewer byproducts than with the CRISPR/Cas9 method in hepatocytes affected by GSD Type 1a. However, prime editing involves choices of pegRNA-induced nick location, sgRNA-induced second nick location (for the PE3 and PE3b systems), PBS length, RT template length, and which strand to edit first, so these factors should first be optimized for hepatocytes. Much additional research is also needed to assess off-target prime editing in a genome-wide manner, to characterize the extent to which prime editors might affect cells, and to analyze their immunogenic effects in animal models.
In addition to the R83C/H mutations that we propose to correct by prime editing, mRNA delivery can be an alternative way to restore glucose-6-phosphatase activity for the other types of mutations in the population. The mRNA delivery method can be discussed from two angles: the mRNA itself and the delivery method. Regarding the mRNA itself, its physical and temporal characteristics allow it to be used as a safe genetic material for gene-based therapy that does not require genomic integration. mRNA provides rapid protein expression even in non-dividing and difficult-to-transfect cells such as dendritic cells and macrophages. As a genetic element, mRNA also has predictable and consistent protein expression kinetics, especially when compared with DNA transfection, whose expression onset is variable. Moreover, because mRNA does not integrate into the genome, it lowers the risk of carcinogenesis; it is degraded naturally in the cell, so its activity is temporary if an unwanted effect occurs; it does not need to be transported into the nucleus to be translated; and a single mRNA molecule can be translated into many protein molecules. However, the widespread application of mRNA in medical research and in the development of new therapeutic modalities has been limited by its perceived instability, susceptibility to degradation, inadequate translatability, and immunostimulatory effects. A further drawback compared with prime editing is that protein expression from IVT mRNA is transient, so GSD Type 1a patients would need repeated mRNA administrations, possibly every few days, to maintain continuous expression of the enzyme. In this study, mRNA delivery was considered a potentially effective treatment for individuals with GSD Type 1a: it could relieve symptoms with long-term use, and it can be stopped directly in case of a possible side effect.
Modifications of the mRNA and a hepatocyte-targeted delivery vehicle may help increase the effectiveness of this method. Like other nucleic acids (e.g. DNA and siRNA), naked mRNA cannot easily cross the cell membrane on its own and therefore needs a delivery system to increase cell permeability. Viral vectors have been used as mRNA carriers, but they may harm cells because of potential immunological side effects and toxicity, and they are subject to vector size limitations. Non-viral strategies such as electroporation, gene gun, and sonoporation have been explored more extensively as mRNA delivery systems. However, although manipulation of cells by mRNA transfection using such approaches is possible, it is rather laborious, expensive, and generally unsuitable for extensive applications. Apart from these delivery methods, mRNA-loaded lipid nanoparticles are promising and allow non-toxic, degradable delivery of full-length mRNA to liver hepatocytes. For our study, lipid nanoparticles can be used to deliver full-length glucose-6-phosphatase (G6Pase) mRNA to liver hepatocytes, but the size of the lipid nanoparticles and the modifications of the G6Pase mRNA should be optimized to limit immunogenic effects and improve stability.
We propose that prime editing can correct the R83C/H mutations, which are seen on nearly 60 percent of the alleles of GSD Type 1a patients in Turkey, and that mRNA delivery can recover the enzyme activity in the rest of the GSD Type 1a patients. In addition to these two methods, the p.W160*, p.W86*, and p.S15* mutations, which cause premature termination in the protein synthesis of glucose-6-phosphatase, might be addressed with the commercially available drug Ataluren. Ataluren is designed to treat nonsense mutations in Duchenne muscular dystrophy (DMD). DMD is an X-linked neuromuscular disorder that results from the inability of the DMD gene to produce the dystrophin protein properly. Nearly 15% of DMD cases result from nonsense mutations, which is why Ataluren or similar drugs can be used for the treatment of DMD. Ataluren or similar drugs that act on premature termination could possibly also be used to treat GSD Type 1a patients with W160*, W86*, and S15* mutations, who account for nearly 15% of GSD Type 1a patients. Ataluren or similar drugs could therefore be tested experimentally on hepatocytes carrying W160*, W86*, and S15* mutations to determine whether full-length glucose-6-phosphatase is synthesized. In our analysis, we did not examine the allele distribution between sexes; we only have the distributions of the different types of mutations among GSD Type 1a patients. Had we examined those data, we could have interpreted the allele distributions more comprehensively. Also, prime editing and mRNA delivery are only proposed methods in this paper; they require experimental validation both in vitro and in vivo.
We also obtained 3D models of both the wildtype and the R83C variant proteins using the computational tools I-TASSER, CPHmodels (Nielsen et al., 2010), ExPASy, PHYRE-2, and ROBETTA. These tools provide predicted models, not experimentally determined 3D structures of G6Pase, so we can only estimate how the tertiary structure of the protein changes when it is mutated. However, if the 3D structures of both the wildtype and the mutant proteins can be determined by X-ray crystallography or similar techniques, these structures could be very important for understanding this disease at the molecular level and for improving other possible treatments.

Conflict of interest statement
Nothing declared.

Acknowledgments
To begin with, we thank Mrs. Zeynep Demirci for her support and helpful suggestions. Together with Intergen, we sincerely thank each hospital and medical doctor who shared patient data with Intergen and us. Also, as A.M., C.U., S.M., B.Ş.D., and F.N.G., members of the TIMELESS Team of RaDiChal 2020, we would like to thank the Rare Disease Challenge (RaDiChal), an international contest for creating novel genetic treatment projects for rare diseases, for allowing us to conduct this project and write this paper. Last but not least, we sincerely thank everybody who supported us during this journey.