Gene therapy approach for treating delta F508, G542X and R553X Cystic Fibrosis mutations through the usage of CRISPR prime editing

Cystic Fibrosis is a rare genetic disease that impairs the transport of chloride ions due to mutations in the CFTR (cystic fibrosis transmembrane conductance regulator) gene. Although nearly 2000 mutations have been associated with the condition, the most common is F508del, the deletion of a phenylalanine residue at position 508. G542X, a Class I mutation, is also very common, and no modulator treatment is available for it. Furthermore, it has been reported that the R553X mutation can be corrected simultaneously with G542X. The main focus is therefore on designing a gene therapy project that can correct all three mutations at once by using the prime editing technique with lipid-based delivery. The mutations would be edited through plasmids designed to contain two pegRNAs and the Cas enzyme. To implement such an approach efficiently, ex vivo, animal model, and in vivo steps are to be designed. For the cell line, fibroblasts are selected due to their simplicity and low cost. The ferret is chosen as the animal model because of the high similarity of its CFTR protein to the human one, and the procedure will finally proceed to direct application in human Cystic Fibrosis patients. The plasmids are intended to be delivered in a cationic liposome that will reach the lungs with the aid of a nebulizer. At the last stage of the experimental procedure, Sanger sequencing will be done to confirm that the desired edit within CFTR has been made, and Next Generation Sequencing will be used to check for off-target mutations in the remainder of the genome. To detect the presence and expression of CFTR protein in humans, immunodetection with flow cytometry will be conducted.


Introduction
In the mid-19th century, Friedrich Miescher, a Swiss physician working in a laboratory at the University of Tübingen, was studying leukocytes when he came across a previously unknown molecule that would forever change how people thought about heredity: he had discovered DNA [1]. It was almost 100 years later that Watson and Crick confirmed the 3D double-helix structure of DNA, allowing all the evidence to be analyzed together and the foundations of genetics and inheritance to finally be visualized [2]. Soon enough, many puzzling questions that had confronted humanity in the form of unexplainable diseases were about to be resolved.
Cystic Fibrosis is one of the most astonishing cases in which science resolved the origin of a genetic condition, and its story goes back to medieval Europe. Folklore has preserved a prophecy that was at the time believed to be a curse: any child who tasted of salt upon a kiss on the forehead was destined to die soon, as the salty taste of the skin was the sign of being cursed [3]. The disease remained almost mystical until approximately the 1930s-1950s, although there was other evidence throughout history. The very first scientific remark on Cystic Fibrosis was written in 1595 in notes from Leiden University in the Netherlands. After performing the autopsy of a young girl who had died at 11 due to malnutrition, Pieter Pauw, the physician on the case, concluded that there was a problem with the patient's pancreas. It appeared brighter in color and much more swollen than a reference pancreas. As it was too early in the patient's lifespan for signs of grave lung infection to appear, it was inferred at the time that the condition mostly affected the pancreas [4]. Yet it was not until 1938 that Cystic Fibrosis could be identified as a distinct disorder, which was later understood to involve a mutated protein (later referred to as CFTR) that causes a defect in chloride ion transport and eventually leads to increased sodium reabsorption [5]. Following the same logic of deducing data from several autopsies of children who had died far too young due to malnutrition, Dr. Dorothy Andersen from the US was finally able to name the disorder cystic fibrosis of the pancreas. The name was later revised to mucoviscidosis as the thick mucus property became more prominent [6].
Today, Cystic Fibrosis is defined as a recessively inherited genetic condition in which the CFTR transmembrane protein is dysfunctional. This means that chloride ions cannot pass through the channel. When chloride cannot reach the surface of the cells, sodium absorption increases and water is no longer retained at the cell surface. This molecular disruption results in the formation of thick mucus that clogs the mucus-producing exocrine glands, and especially the lungs [7]. Before taking into account the diversity of the disease, which ranges from a mutant protein to the absence of any CFTR protein at all, it is important to understand the structure of the CFTR protein at the molecular level.
1.1 Molecular Basis of CFTR Protein
A protein can be defined as any polymer built from amino acids linked together by peptide bonds. Proteins have a huge variety of functions, which are primarily determined by the amino acid sequence that makes up the primary structure of the macromolecule [8]. The CFTR protein, which arises from the gene of the same name located on chromosome seven, is also referred to as the cystic fibrosis transmembrane conductance regulator and is a relatively large protein of approximately 1480 amino acids. CFTR crosses the cellular membrane through two MSDs (membrane-spanning domains), each of which is built from six alpha helices. In addition, CFTR has two ATP binding sites on its cytoplasmic domains, also named NBDs (nucleotide-binding domains), followed by a phosphorylation site on its regulatory domain, denoted (R). This region carries the sites that are phosphorylated by protein kinases such as PKA and/or PKC [9].
The chloride channel is formed by the MSD1 and MSD2 domains, each of which comprises six alpha-helical transmembrane-spanning regions, shown in the figure as TM1-TM12. Additionally, the two ATP binding sites are depicted as NBDs and are located on the cytoplasmic side. The R (regulatory) domain, where phosphorylation occurs, is located on the cytoplasmic side as well [10].
As a whole, the protein is typically located at the apical side of the plasma membrane, where it can regulate electrolyte exchange, especially chloride flow. Another essential aspect of CFTR studies is its classification as an ABC transporter. This implies that the protein uses ATP as its primary energy source. ABC transporters are typically built from two different parts: one contains the membrane-spanning domain, whereas the other (on the cytoplasmic side) is responsible for ATP hydrolysis and carries nucleotide-binding motifs labeled Walker sites A and B. All in all, the CFTR protein belongs to a highly versatile class of transporters found in all three domains of life, which allows us to think of ABC transporters as a tool for solute and ion exchange across the plasma membrane of cells [11]. Based on experimental evidence, the CFTR pore diameter is thought to be ~5.3 Å, a value close to the size of chloride ions. The same study revealed that the channel is permeable to water and urea too, thus raising questions about the role of ATP and even about whether CFTR is a pore permeable to multiple ions [12].
1.2 Mutations Associated with the Disease and their Classes
Statistical data reveal that the leading mutation behind Cystic Fibrosis symptoms is the deletion of the phenylalanine residue at position 508, abbreviated as Phe508del (F508del). It is without doubt the most critical genetic alteration, as 70% of individuals who are homozygous recessive for Cystic Fibrosis carry the mutation. At the same time, approximately 1900 other mutations within the CFTR gene have been shown to be associated with the disease. This is what makes the disease so heterogeneous, as it can exhibit different phenotypic traits in different individuals [13].
Taking into account all the different mutations and the different symptoms that Cystic Fibrosis can manifest, it is important to highlight that the disease is today classified based on the defects that occur in the CFTR protein. Although the effect of this huge number of mutations is also influenced by epigenetic factors, there is a widely used classification that groups mutations into six major classes [14]. Below, the main characteristics of each class are briefly explained.
Class I yields a reduced or totally absent CFTR. This is due to encountering a stop codon earlier than in the wild-type sequence, and it is attributed to nonsense mutations as well as frameshift and/or splicing mutations. Class II, on the other hand, does produce a CFTR protein, but one that is misfolded and cannot be fully corrected by its chaperones. This causes the protein to undergo ERAD, which degrades the polypeptide and leads to significantly lower CFTR levels within the cells. Class III comprises mutations whose major outcome is impaired gating, meaning that the channel has difficulty opening properly. In Class IV, conductance is decreased because ion passage through the transmembrane pore is impeded. Of the last two classes, Class V involves splicing abnormalities that lead to less protein, and Class VI involves reduced stability of the CFTR channel itself. The first three classes exhibit much more severe symptoms than Classes IV-VI [14]. Figure 2 gives a clearer visualization of what happens in each class of mutation [15].
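The six classes described above can be captured in a small lookup table. The following is a minimal sketch in Python, not part of the original project: the names `CFTR_CLASSES`, `EXAMPLE_MUTATIONS`, and `is_severe` are hypothetical, and the assignment of F508del to Class II and of the nonsense mutations G542X and R553X to Class I is an assumption based on the standard classification rather than on data given in this section.

```python
# Hypothetical lookup table for the six CFTR mutation classes described above.
CFTR_CLASSES = {
    1: "reduced or absent CFTR (premature stop codon)",
    2: "misfolded CFTR degraded via ERAD",
    3: "impaired channel gating",
    4: "decreased conductance",
    5: "abnormal splicing, less protein",
    6: "reduced protein stability",
}

# Assumed class assignments for the three mutations targeted in this project.
EXAMPLE_MUTATIONS = {"G542X": 1, "R553X": 1, "F508del": 2}


def is_severe(class_number: int) -> bool:
    """Classes I-III are described as more severe than classes IV-VI."""
    if class_number not in CFTR_CLASSES:
        raise ValueError(f"unknown CFTR mutation class: {class_number}")
    return class_number <= 3


def describe(mutation: str) -> str:
    """Return a one-line summary for a mutation listed in EXAMPLE_MUTATIONS."""
    class_number = EXAMPLE_MUTATIONS[mutation]
    severity = "severe" if is_severe(class_number) else "milder"
    return f"{mutation}: class {class_number}, {CFTR_CLASSES[class_number]} ({severity})"


if __name__ == "__main__":
    for name in EXAMPLE_MUTATIONS:
        print(describe(name))
```

Under these assumptions, all three target mutations fall into the more severe group of classes.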
When regarding this issue at the macro level, it is essential to understand that even the tiniest mutations, such as F508del, can have a tremendous effect on the body. The major consideration is that Cystic Fibrosis can affect many organs simultaneously. Additionally, its diversity of symptoms requires treatment that is specific and tailored to the individual, especially to prevent further progression of the symptoms and chronic conditions such as lung infections. However, the most common symptoms that a person with Cystic Fibrosis would exhibit, regardless of mutation, include skin that tastes noticeably of salt, frequent pneumonia and/or other lung infections, shortness of breath, difficulty gaining weight, and problems with bowel movements [7]. It is important to detect the disease early so that adequate medication is provided as fast as possible and worsening of the symptoms is impeded.
1.3 Testing and Classical Treatments of the Disease
Firstly, genetic testing and/or pedigree analysis is recommended, especially for prospective parents in whose families a CFTR mutation is known to run. Even though this approach will not cure the disease or guarantee that the offspring is free of cystic fibrosis, it gives a good estimate of the probability of inheriting the recessive mutation. Unless Cystic Fibrosis is included in the newborn screening performed by the hospital, a sweat test can be performed, in which the Cl- ion concentration in the sweat is measured via electrical stimulation and a filter paper. It is important to continue screening even if the patient's result is intermediate. Lastly, one of the most reliable tests is IRT-DNA, performed on a blood sample collected from the patient. The procedure is quite efficient, as it not only checks the IRT (immunoreactive trypsinogen) level but also screens for mutations in the CFTR gene. Once the patient has been examined and identified as positive, several tests have to be taken routinely, such as frequent lung X-rays, blood tests, sputum cultures, etc. [16].
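The probability estimate mentioned above for prospective parents follows from simple Mendelian inheritance of a recessive allele. The sketch below is illustrative only: the function name `offspring_genotypes` and the two-letter genotype notation ('A' for a functional CFTR allele, 'a' for a mutated one) are assumptions introduced here, and the calculation ignores everything except single-gene recessive inheritance.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1: str, parent2: str) -> dict:
    """Punnett-square probabilities for one gene.

    Each parent is a two-letter genotype such as 'Aa', where 'A' is a
    functional allele and 'a' a mutated (recessive) allele.
    """
    # Every combination of one allele from each parent is equally likely.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    carriers = offspring_genotypes("Aa", "Aa")
    print("affected (aa):", carriers.get("aa", Fraction(0)))
    print("carrier (Aa):", carriers.get("Aa", Fraction(0)))
```

For two carrier parents this gives a one-in-four chance of an affected child, whereas a carrier and a non-carrier parent cannot have an affected child under this simple model.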
The next issue to be taken into account in this paper is the chronic conditions of Cystic Fibrosis patients. Since the CFTR transmembrane channel is expressed in almost all regions of the body, the disease is expected to have several different effects. The disease tends to worsen over time: although severe lung damage is uncommon in young patients, adults are expected to have persistent lung infections accompanied by cough and wheezing. All of these are attributed to mucus clogging the lungs and creating a favorable environment for bacteria to divide quickly and invade the tissue. As exocrine cells are primarily affected by the defective CFTR gene, patients are also expected to suffer from malnutrition as well as pancreatic inflammation. In males, infertility is encountered, and other late-onset manifestations of the disease include arthritis and/or sinusitis [17].
Even though this paper focuses on finding a gene therapy solution that would once and for all reverse the mutation in the cells where the therapeutic procedure is implemented, one last remark should be made on modulator medications in the context of Cystic Fibrosis.
A modulator is a medical therapy that targets the underlying defect (in this case, the CFTR protein) instead of dealing with the symptoms of the disease. The first category comprises potentiators. With Ivacaftor as the major therapy, the channel formed by the CFTR protein is expected to open more readily, thus facilitating the passage of chloride ions. It is particularly effective in individuals with the Gly551Asp Cystic Fibrosis mutation. Correctors are another class of Cystic Fibrosis modulators. Drugs like lumacaftor aim to improve CFTR trafficking, mainly in F508del mutations. Scientists are actively working on producing effective drugs that combine modulators and act on different mutations simultaneously [18].

Main Principles of Gene Therapy
In the second part of this section, it is important to mention the essentials of gene therapy principles and advances. To begin with, genetic engineering is regarded as a popular and rapidly advancing field of genetics. The 1980s were the period in which scientists tried to use human DNA to solve and cure different kinds of human genetic problems [19]. The field is generally based on creating a targeted double-stranded break at the required site on the chromosome. The progress of genetic engineering gained speed thanks to growing knowledge of and experiments with restriction enzymes and recombinant DNA technology. Despite the ethical issues surrounding genetic engineering, in 1981 the first transgenic animal (a mouse) was obtained by transferring a rabbit gene into the mouse via DNA microinjection, performed by Thomas Wagner in Ohio [19]. That study triggered a revolution in transgenic animal experiments. In 1982, the world's first genetically engineered drug, insulin (also known as Humulin), was released to the market for human use and was used to treat patients with Type I diabetes [19].
Genetic engineering continued to develop over the years, and different genome editing techniques were introduced. The first genome editing technique to be discovered was the zinc-finger nuclease (ZFN). ZFNs are widely used hybrids between a specially designed Cys2-His2 zinc-finger protein and the cleavage domain of the FokI restriction endonuclease [20]. ZFNs function as dimers, and through the zinc-finger DNA-binding domain each monomer recognizes a specific DNA sequence of typically 9 to 18 base pairs. Dimerization of the ZFN protein is provided by the FokI cleavage domain, which cleaves DNA within a five- to seven-base-pair spacer sequence. ZFNs generally consist of 3-4 zinc-finger domains, and each domain consists of nearly 30 amino acid residues arranged as a ββα motif. The residues that mediate DNA recognition are located in the α-helical region and typically contact three base pairs of DNA, with some overlap from an adjacent domain. Many zinc-finger domains exist that recognize different DNA triplets. Polydactyl zinc-finger proteins that can target a wide range of DNA sequences can be obtained by fusing these domains in tandem using a canonical linker peptide [20].
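The modular arithmetic of ZFN recognition described above can be made explicit. The sketch below is a simplified illustration, assuming that every finger reads exactly three base pairs and ignoring the overlap with adjacent fingers mentioned in the text; the function name `zfn_site_length` is hypothetical.

```python
BP_PER_FINGER = 3  # each zinc-finger domain contacts roughly one DNA triplet


def zfn_site_length(fingers_left: int, fingers_right: int, spacer_bp: int) -> int:
    """Total length in bp of a ZFN dimer target site.

    The site consists of the two half-sites bound by the monomers plus the
    spacer in which the dimerized FokI domains cut.
    """
    if not 5 <= spacer_bp <= 7:
        raise ValueError("FokI spacer is expected to be 5-7 bp")
    if fingers_left < 1 or fingers_right < 1:
        raise ValueError("each monomer needs at least one finger")
    return BP_PER_FINGER * (fingers_left + fingers_right) + spacer_bp


if __name__ == "__main__":
    # Two three-finger monomers (9 bp each) around a 6 bp spacer.
    print(zfn_site_length(3, 3, 6))
```

With three fingers per monomer and a 6 bp spacer, the full target site spans 24 bp under these simplifying assumptions.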
As in all genome editing techniques, the main concern and risk with ZFNs is off-target mutation. To decrease this risk, specificity has been improved, for example by preventing unwanted dimerization of the FokI cleavage domains. The cleavage efficiency of FokI has also been increased using protein-engineering methods [20]. In contrast to the TALEN and CRISPR techniques, assembling zinc-finger arrays for ZFNs is difficult, which is why the method is not considered very practical or flexible for laboratory use.
Until studies on bacteria of the Xanthomonas genus (pathogens that damage agricultural crops), ZFNs were the only genome editing tool; those studies then led to the development of TALENs. The bacteria secrete effector proteins (transcription activator-like effectors, TALEs) into the cytoplasm of plant cells, which make the plant more susceptible to the pathogen. Further studies of these effector proteins showed that they can mimic eukaryotic transcription factors and activate the expression of target genes. TALE proteins bind DNA and consist of a central DNA-binding domain, a nuclear localization signal, and an activation domain that switches on gene expression [21]. The DNA-binding domain consists of repeats of 33-34 amino acids that differ at the 12th and 13th positions. These positions are called the "Repeat Variable Di-residue" (RVD); they are variable and correlate strongly with the nucleotide that each repeat recognizes. The non-specific DNA cleavage domain of the FokI endonuclease can be fused to them to create hybrid nucleases that are active in various cell types. Engineering with TALENs is based on this relationship between the amino acid sequence of the TALE binding domain and the DNA sequence it recognizes. After the components of a TALEN are assembled, they are placed in plasmids, which are used to transfect the target cells; the TALENs are then expressed and enter the nucleus. They can also be delivered as mRNA, which does not allow genomic integration of the expression construct [22]. TALE nucleases (TALENs) have been used successfully in genome editing, for example in zebrafish, to introduce intended mutations through the repair of double-stranded breaks by nonhomologous end joining (NHEJ) or homology-directed repair (HDR). Compared with other genome-editing nucleases, TALENs show high binding specificity. In addition, TALENs impose fewer sequence constraints on target-site selection while showing comparable mutagenic activity [23].
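The relationship between RVDs and recognized nucleotides described above behaves like a simple code. The sketch below is illustrative: it assumes the four commonly cited RVD assignments (NI for A, HD for C, NG for T, NN for G), which are not listed in this paper, and it ignores the known relaxed specificity of some RVDs (NN, for instance, can also bind A). The names `RVD_CODE` and `tale_target` are hypothetical.

```python
# Assumed, simplified RVD-to-nucleotide code (one RVD per TALE repeat).
RVD_CODE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",  # simplification: NN can also recognize A
}


def tale_target(rvds: list) -> str:
    """Translate an ordered list of RVDs into the DNA sequence they recognize."""
    try:
        return "".join(RVD_CODE[rvd.upper()] for rvd in rvds)
    except KeyError as missing:
        raise ValueError(f"RVD not in the simplified code: {missing}") from None


if __name__ == "__main__":
    print(tale_target(["HD", "NI", "NG", "NN"]))
```

Because each repeat contributes exactly one nucleotide of specificity, retargeting a TALE amounts to reordering repeats, which is the practical advantage over zinc-finger arrays noted in the text.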
Today's most popular and preferred genome editing tool, CRISPR/Cas9, earned Emmanuelle Charpentier and Jennifer Doudna the Nobel Prize in Chemistry in 2020. CRISPR/Cas9 systems are in fact adaptive immune systems that defend prokaryotes against viruses. The main function of the system is to cleave the foreign nucleic acids of viruses and thereby prevent viral infection [24]. CRISPRs (Clustered Regularly Interspaced Short Palindromic Repeats) are short repeated DNA sequences found in the genomes of archaea and bacteria; they were first discovered in 1987 by the Japanese scientist Yoshizumi Ishino in the E. coli genome. Unfortunately, the function of these repeated sequences could not be understood at the time because of the lack of DNA sequence data. In 1993, however, J.D. van Embden and his colleagues realized that different strains of Mycobacterium tuberculosis had different spacer sequences between the DNA repeats. Based on this finding, they characterized and classified Mycobacterium tuberculosis strains according to their spacer sequences, a method known as spoligotyping. Furthermore, these spacer sequences were observed in the genomes of other bacteria and archaea, and Francisco Mojica and Ruud Jansen referred to these sequences as CRISPRs [24].
Knowledge of CRISPR systems advanced when Mojica and his coworkers discovered similar repeated sequences in halophilic archaea during a study of the mechanisms of adaptation to highly salty environments. The researchers thought that these sequences had an important role in the regulation of gene expression and that they facilitated the conversion of double-stranded DNA from the B to the Z form to allow binding of a regulator protein [25]. Thanks to the efficient sequencing machines invented in the 1990s, genome sequencing became accessible and these repeated sequences were identified in bacteria and archaea. The sequences were observed in nearly all archaeal genomes and in about half of the bacterial genomes. Analysis of the genomic sequences also revealed the characteristic properties of CRISPRs: a location in intergenic regions, short repeats with little variation, a non-conserved distribution of the repeats, and a common leader sequence on one side of the clustered repeats [25]. The function of these repeats was understood when the relation between Cas genes and the repeats was discovered. In 2002, it was found that CRISPR loci are closely linked to a cluster of genes called Cas genes. These genes are found near the CRISPR loci in both archaeal and bacterial genomes. Cas genes were found to encode helicases and nucleases that could take part in DNA metabolism [26]. After these findings, the mechanism of the CRISPR/Cas system in prokaryotes became well understood. Cas proteins are responsible for cutting foreign DNA found in the prokaryote into smaller fragments of about 20 base pairs and inserting them into the CRISPR array as spacers. Other Cas proteins then process the transcribed CRISPR locus to create CRISPR RNAs (crRNAs). For genome editing, the crRNA and the trans-activating crRNA (tracrRNA) can also be fused together to form a single-guide RNA (sgRNA).
Through sequence homology to the stored spacers, crRNAs guide the Cas nuclease to exogenous genetic material that carries a species-specific sequence known as a protospacer adjacent motif (PAM). The CRISPR complex binds the foreign DNA and destroys it by cleavage, protecting the prokaryote from the virus [27]. Thanks to its flexibility and ease of application, the CRISPR/Cas9 system has been used with both nonhomologous end joining and homology-directed repair to generate genetically engineered eukaryotic cells with specifically designed mutations. For instance, transgenic mice were successfully obtained with the CRISPR system by directly injecting sgRNA and mRNA encoding Cas9 into embryos. The Cas gene family has many members, and different Cas proteins have different properties, so they can be chosen according to the characteristics required for the selected organism [28]. Cas9 nucleases have some advantages over ZFNs and TALENs: the targeting efficiency of Cas9 is higher, retargeting and customization are easier, and multiple loci in the genome can be targeted and edited simultaneously by combining sgRNAs. However, like other nucleases, Cas9 carries a risk of off-target mutations, and there are ethical issues with using it in the human genome [28].
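The role of the PAM in target selection can be illustrated with a short search routine. The sketch below is a simplified example, assuming the NGG PAM of SpCas9 and a 20-nucleotide protospacer; it scans only the given strand, whereas a real design tool would also scan the reverse complement and score candidates. The function name `find_spcas9_targets` is hypothetical.

```python
import re


def find_spcas9_targets(sequence: str, protospacer_len: int = 20) -> list:
    """Return (protospacer, PAM, PAM start index) for every NGG PAM on this strand."""
    seq = sequence.upper()
    targets = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= protospacer_len:  # need a full protospacer 5' of the PAM
            protospacer = seq[pam_start - protospacer_len:pam_start]
            targets.append((protospacer, match.group(1), pam_start))
    return targets


if __name__ == "__main__":
    demo = "A" * 20 + "TGG" + "CCA"  # artificial sequence, not from CFTR
    for protospacer, pam, position in find_spcas9_targets(demo):
        print(protospacer, pam, position)
```

Targeting several loci at once, as described above, then corresponds to choosing one such protospacer per locus and expressing the matching sgRNAs together.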
CRISPR/Cas systems are still being developed to make them less risky, and new CRISPR-Cas tools continue to make genome editing more advanced. CRISPR/Cas-derived genome editing agents can be classified into four groups: nucleases, base editors, transposases/recombinases, and prime editors. Each of them brings new advances such as increased specificity, opening new doors for science but also bringing some limitations [29]. Base editors include two main components: a Cas enzyme for programmable DNA binding and a single-stranded-DNA-modifying enzyme for the targeted nucleotide change. There are two classes, adenine base editors and cytosine base editors. C→T, T→C, A→G, and G→A mutations can be corrected using base editors, and dual base-editor systems for high-efficiency combinatorial editing in human cells are now under investigation [30].
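The four substitutions listed above are exactly the transitions, which can be checked programmatically. The sketch below is a minimal illustration: it assumes that cytosine base editors produce C→T (or G→A when the cytosine on the opposite strand is edited) and adenine base editors produce A→G (or T→C on the opposite strand). The function name `base_editor_for` is hypothetical.

```python
# Substitutions reachable by each editor class, written on the reported strand.
CYTOSINE_EDITS = {("C", "T"), ("G", "A")}  # G->A: the C on the other strand is edited
ADENINE_EDITS = {("A", "G"), ("T", "C")}   # T->C: the A on the other strand is edited


def base_editor_for(current_base: str, desired_base: str):
    """Name the base-editor class able to make this change, or None.

    Returns None for transversions and for unchanged bases, which base
    editors cannot install and which need another tool such as prime editing.
    """
    change = (current_base.upper(), desired_base.upper())
    if change in CYTOSINE_EDITS:
        return "cytosine base editor"
    if change in ADENINE_EDITS:
        return "adenine base editor"
    return None


if __name__ == "__main__":
    for change in [("C", "T"), ("T", "C"), ("T", "G")]:
        print(change, "->", base_editor_for(*change))
```

Any correction that returns `None` here, such as a transversion or a small insertion, falls outside the reach of base editors.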
Beyond the capabilities of Cas nucleases and the other new tools of the CRISPR/Cas system, the precision and efficiency of genome editing are being further increased by the prime editing technique, which is the one selected for BeeWare's project. Unlike nuclease-based CRISPR/Cas systems, prime editing does not rely on double-stranded breaks. Prime editors use a fusion of an engineered reverse transcriptase and a Cas9 nickase (nCas9), together with a prime-editing guide RNA (pegRNA). PegRNAs differ from sgRNAs: a pegRNA contains both the sequence complementary to the target site that directs nCas9 and an additional sequence that spells out the required sequence change [30]. In the mechanism of prime editors, the spacer at the 5′ end of the pegRNA first binds the target DNA strand, exposing the non-complementary strand. The Cas9 nickase nicks this unbound, PAM-containing strand, and the freed 3′ end hybridizes to the primer binding site (PBS) of the pegRNA, generating a primer for the reverse transcriptase (RT) that is linked to nCas9. The reverse transcriptase then extends the nicked PAM strand using the RT template within the pegRNA, rewriting the targeted region in a programmable way. This step results in two redundant PAM-strand DNA flaps: the edited 3′ flap, which was reverse transcribed from the pegRNA, and the original, unedited 5′ flap. Generally, 5′ flaps are preferentially degraded by cellular endonucleases that are ubiquitous during lagging-strand DNA synthesis. Finally, the resulting heteroduplex, made up of the original (unedited) strand and the edited 3′ flap, is resolved and stably integrated into the host genome through cellular replication and repair processes [31].
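The layout of a pegRNA can be made concrete with a small design routine. The sketch below is a simplified illustration under stated assumptions and is not the design procedure used in this project: it assumes an SpCas9 nickase that nicks the PAM strand 3 nt upstream of an NGG PAM, takes the PAM-strand sequence and the desired edited sequence downstream of the nick as inputs, and builds the 3′ extension as the RT template followed by the PBS. The function names and the toy sequence are hypothetical, and the sgRNA scaffold that sits between spacer and extension is deliberately left out.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_peg_components(pam_strand: str, nick: int,
                          edited_downstream: str, pbs_len: int = 13) -> dict:
    """Derive spacer, PBS and RT template for a nick at index `nick`.

    `nick` is the index in `pam_strand` of the first base 3' of the nick;
    the NGG PAM is therefore expected at nick+3 .. nick+5.
    """
    strand = pam_strand.upper()
    if nick < max(17, pbs_len):
        raise ValueError("not enough sequence 5' of the nick")
    if strand[nick + 4:nick + 6] != "GG":
        raise ValueError("no NGG PAM three nucleotides 3' of the nick")
    spacer = strand[nick - 17:nick + 3]                # 20-nt protospacer
    pbs = reverse_complement(strand[nick - pbs_len:nick])
    rt_template = reverse_complement(edited_downstream)
    return {
        "spacer": spacer,
        "pbs": pbs,
        "rt_template": rt_template,
        "extension": rt_template + pbs,                # 5'->3' on the pegRNA
    }


if __name__ == "__main__":
    toy = "ACGTACGTACGTACGTA" + "CCT" + "TGG" + "AAC"    # artificial, not CFTR
    parts = design_peg_components(toy, nick=17,
                                  edited_downstream="ACTTGGAAC", pbs_len=8)
    for name, value in parts.items():
        print(name, value)
```

Reading the extension backwards as DNA gives the PBS-annealing region followed by the edited sequence, which is the order in which the reverse transcriptase writes the 3′ flap.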
It is now thought that many diseases, especially rare genetic diseases, can be treated using prime editing technology. For instance, a study of prime editing on the mutations that cause SCD and Tay-Sachs disease reported a maximum optimized correction of 58% with 1.4% indels for SCD. For Tay-Sachs disease, the study reported 33% efficiency and 0.32% indels with the PE3b system [31]. Studies of prime editing techniques are still continuing, and both knowledge and trials keep improving. It is thought that in the future most genetic diseases could be eliminated using prime editors.

Methods
The main idea of this review project is how CRISPR-Cas9 prime editing can be implemented by designing two pegRNAs for three different mutations: delta F508, G542X, and R553X. These mutations were chosen specifically because of their high frequency of occurrence in Turkey.
Based on literature data, PE3 has been shown to introduce a single base change up to 34 bp downstream of the cleavage site, so the NGG PAM can be positioned far away from the target site. This allows a single pegRNA to correct different mutations in gene hotspots, such as G542X and R553X, since these single nucleotide variants are only 33 bp apart [32]. Therefore, one pegRNA would be designed to target delta F508, and the other pegRNA would target both G542X and R553X. This approach was also proposed as a future direction in the literature, since neither of these two mutations has a CFTR modulator treatment [32] (genomic locations: G542X: Chr7:117587778 and R553X: Chr7:117587811) [33,34]. However, any pegRNA designed to target these two regions together would have a minimum length of 34 bp for the sgRNA part. As the BeeWare team, we have decided to take the efficiency risk and test whether a single pegRNA can target and correct both G542X and R553X. The other pegRNA would meanwhile target delta F508.
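The distance argument above can be verified directly from the quoted coordinates. The following minimal sketch uses the two genomic positions and the 34 bp editing window exactly as stated in the text; the constant and function names are hypothetical.

```python
# Genomic positions on chromosome 7 as quoted in the text.
G542X_POS = 117_587_778
R553X_POS = 117_587_811

# Maximum distance from the nick at which an edit was reported, per the text.
EDIT_WINDOW_BP = 34


def span_bp(*positions: int) -> int:
    """Distance in bp between the outermost of the given positions."""
    return max(positions) - min(positions)


def fits_single_pegrna(*positions: int, window: int = EDIT_WINDOW_BP) -> bool:
    """True if all positions lie within one editing window of each other."""
    return span_bp(*positions) <= window


if __name__ == "__main__":
    print("distance:", span_bp(G542X_POS, R553X_POS), "bp")
    print("single pegRNA feasible:", fits_single_pegrna(G542X_POS, R553X_POS))
```

The two variants are 33 bp apart, which is just inside the 34 bp window; this leaves almost no freedom in where the nick can be placed and is the efficiency risk mentioned above.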
The PE2 system is chosen to perform prime editing due to its increased efficacy and its compatibility with shorter PBS sequences compared to the PE1 system [35]. In order to design the dual-targeting pegRNA, the FASTA sequence was entered manually into the website called "PegFinder" [36]. The website recommends the optimal selections for pegRNA design. The recommended selections and the full-length pegRNA can be seen in Figure 1. Furthermore, the candidate primary editing sgRNAs, RT templates, and PBS sequences shown on the website can be seen in Figures 2 and 3. The pegRNA for delta F508 can be seen in Figure 4; it was obtained with the Prime editing design tool [37].

Figure 2: Displays candidate primary editing sgRNAs [36].

In order to perform prime editing with the PE2 system, two plasmids are required for the components of PE2: one plasmid to generate the pegRNAs and another plasmid for the Cas9 enzyme. For this purpose, a plasmid was designed so that it can contain two pegRNAs. The other plasmid is planned to contain Cas9 (H840A). The plasmid that contains Cas9 (H840A) is pCMV-PE2, which will be ordered from David Liu's lab (Addgene plasmid # 132775) (seen in Figure 5). For the pegRNAs, the pU6-pegRNA-GG-acceptor plasmid will be ordered from David Liu's lab (Addgene plasmid # 132777), as seen in Figure 6 [35]. The oligos for ligation cloning of the pegRNA into Addgene #132777 are shown in Figure 7. Cloning will be performed according to the described protocols of the plasmids [35]. First of all, the pU6-pegRNA-GG-acceptor plasmid will be digested with BsaI, and the fragment containing the origin of replication, U6 promoter, U6 poly-T termination sequence, and AmpR gene will be isolated. The digestion components will be combined in a PCR tube and incubated at 37 °C for 4-16 hours, after which the approximately 2.2-kb fragment will be isolated.

Figure 3: Displays candidate RT templates and PBS sequences
Afterward, the oligonucleotide parts will be ordered and annealed. Oligonucleotides will be ordered from Integrated DNA Technologies. Annealing buffer will be prepared using H2O supplemented with 10 mM Tris-Cl pH 8.5 and 50 mM NaCl. The following components will be combined in a PCR tube, heated in a thermocycler at 95 °C for 3 minutes, and then cooled down to 22 °C:
Component 2: The desired spacer (target) sequence flanked by the indicated overhangs
Component 3: The desired pegRNA 3' extension template flanked by the indicated overhangs
Component 4: SpCas9 sgRNA scaffold sequence featuring compatible Golden Gate overhangs
Afterward, the annealed oligonucleotides will be diluted 1:4 by adding 75 µL H2O. Every oligonucleotide will have a final concentration of 1 µM. To phosphorylate the sgRNA scaffold, the phosphorylation reaction will be prepared and incubated in a thermocycler at 37 °C for 60 minutes. For pegRNA assembly, the Golden Gate reaction will be followed. The reactants will be incubated at room temperature for 10 minutes or cycled between 5 minutes at 16 °C and 5 minutes at 37 °C for 8 cycles. After incubation, the reaction will be placed in the thermocycler for 15 min at 37 °C, then 15 min at 80 °C, and then held at 12 °C. Later, 1 µL of the 10 µL assembly reaction will be transformed into 10 µL of competent cells such as E. coli. The pU6-pegRNA-GG-acceptor plasmid confers ampicillin and carbenicillin resistance, so transformants will be resistant to these antibiotics and can easily be selected. The pegRNA cloning protocol is summarized in Figure 8 [35].
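The volumes and durations quoted in the protocol above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that "1:4" means one part annealed oligonucleotide in four parts total volume, so that adding 75 µL of water implies a 25 µL annealing reaction and a 4 µM concentration before dilution. These derived values are not stated in the text, and the function names are hypothetical.

```python
def starting_volume_ul(added_water_ul: float, dilution_factor: float) -> float:
    """Volume before dilution, given the water added and a 1:factor dilution."""
    if dilution_factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    return added_water_ul / (dilution_factor - 1)


def concentration_before_dilution(final_um: float, dilution_factor: float) -> float:
    """Concentration before a 1:factor dilution that ends at `final_um`."""
    return final_um * dilution_factor


def golden_gate_minutes(cycles: int = 8, low_step: int = 5, high_step: int = 5,
                        final_digest: int = 15, heat_kill: int = 15) -> int:
    """Total programmed time for the cycled Golden Gate option, excluding holds."""
    return cycles * (low_step + high_step) + final_digest + heat_kill


if __name__ == "__main__":
    print("annealing reaction volume (uL):", starting_volume_ul(75, 4))
    print("oligo concentration before dilution (uM):", concentration_before_dilution(1, 4))
    print("Golden Gate thermocycler time (min):", golden_gate_minutes())
```

Under this reading, the cycled Golden Gate option occupies the thermocycler for a little under two hours before the 12 °C hold.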
The open scheme of the planned plasmid with the pegRNAs assembled can be seen in Figure 9. The next main issue of the project is finding an efficient way to transfer the generated plasmids into the cells of interest. To do this efficiently, it is important to use a technique suited to each stage of the project. To initiate our project, we will use cultured cells, more precisely fibroblasts. These are to be obtained via skin biopsy from volunteer CF patients in Turkey who carry the precise mutations that our project targets. Since this will be an in vitro procedure, according to the literature, a plasmid can easily be transfected through methods such as electroporation, nucleofection, or lipofectamine-mediated (lipid-based) transfection [38]. However, our project is intended to use only a non-viral, liposome-based approach for both the in vitro and in vivo stages.
While checking the literature for plasmid delivery, two main issues were encountered: the efficacy of the system and its immunogenicity. In the case of Cystic Fibrosis, the epithelial cells of the lungs are the main target for plasmid delivery. However, the condition of such patients is characterized by persistent bacterial infections, lung damage, and dehydrated mucus [39]. Logically, a viral vector, especially some form of adenovirus, would be highly favored not only in terms of capacity but, most importantly, in terms of airway tropism, which makes effective transduction much easier to achieve [39]. Nonetheless, there is a major issue with such a choice: viral vectors are prone to causing inflammation, and such an immunological response is not advisable in CF patients, especially since the response would occur in the lungs. Therefore, even though a viral vector might be the right choice for a CRISPR/Cas9 system in general, the special conditions of Cystic Fibrosis favor non-viral vectors, and more specifically liposomes. In our project, two plasmids are expected to be transfected simultaneously into the target cells. This means that we will need to find a liposome type that is large enough to carry both plasmids at once. In the second stage of our work, we expect to work with in vivo applications of gene therapy, first in ferrets and then in humans. Delivery to the nasal epithelium is expected to be the first route for the cationic liposomes, possibly with the use of a nebulizer. The straightforward part of gene delivery in the case of Cystic Fibrosis is that the disease is monogenic, so correcting the single affected gene in the lungs could result in a substantial improvement and a possible cure for the lung disease.
When considering cationic lipids as delivery tools, we chose the cholesterol-based ones that were also found to contain polyamide and/or ether linkages within their structure. Non-viral carriers of this kind protect the nucleic acids of the plasmids from degradation. They are easily synthesized, are considered safe tools, and are known to generate only a low immune response [41]. In the case of Cystic Fibrosis, taking into account the inflammation already present in the lungs and many other factors, liposomes can be considered the most suitable delivery route.
A cationic lipid is expected to have a hydrophobic domain in its structure that is linked to the cationic head group [41]. The cationic head group is the portion of the liposome responsible for carrying the DNA. As the plasmid DNA has a negative charge, a charge-based interaction is expected to occur with the amines located in the cationic part of the liposome. Ideally, primary amine head groups binding to DNA would be the most effective, as they have a stronger positive charge than the others [41]. Additionally, to increase efficiency, a cholesterol backbone containing an ether linkage is expected to be part of the liposome, also in view of the low toxicity of cholesterol compared with other compounds [42]. Once the non-viral vector to be used in the experiment is known, the only issue left is that of size.
DOTAP/DOPE is a commercial cationic liposome formulation considered quite effective in terms of transfection. The DOTAP-to-DOPE ratio is to be checked in terms of size and transfection efficacy. We need an appropriate size distribution and zeta potential for both plasmids to fit into the structure [42]. First, the plasmids created in the previous section are amplified in laboratory E. coli strains, and once the desired plasmid amount is reached, the DNA is purified. DOTAP/DOPE can be bought as a transfection reagent and is supplied as a lyophilized powder. According to the instructions, the compound can be used for DNA transfection in fibroblasts, and it can also be nebulized [43]. The preparation protocol suggests maintaining a ratio of 1 µg DNA per 10 µg lipid. Such a simple relation makes it possible to fit more than one plasmid, or a large plasmid, after which the formed cationic lipid complex is ready for transfection. In the case of in vitro fibroblast transfection, it is enough for the liposome preparation to be washed twice and then treated with HEPES-buffered saline. The cells of interest, in our case fibroblasts, are incubated together with the buffer solution and the formed liposomes [44]. However, in the in vivo stages of the project, the approach is rather different.
In that case, the plan is to nebulize the formed liposomes so that the plasmids reach the lungs through inhalation therapy. Literature data show that, even though the plasmids inside liposomes can be damaged during nebulization, DOPE-based complexes are still the most stable among all structures [44]. When considering drug delivery in the lungs, there are three main issues to be taken into account: the anatomy of the lung itself, which acts as a natural defense mechanism; the pathological barrier, which in Cystic Fibrosis patients is very prominent and in a state of almost chronic infection; and the immunological barrier, mainly consisting of the action of macrophages [45]. Additionally, the inhalable liposome formulation is suggested to be smaller than 5-6 µm and to have an adjustable FPF (fine particle fraction) so that it can safely reach the bronchioles and alveoli [45].
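To make the fine particle fraction concrete, the following minimal sketch computes it as the share of aerosol mass carried by particles below a cutoff diameter. The 5 µm cutoff follows the lower bound of the size limit quoted above; the function name and all size-bin values are hypothetical illustration data, not measurements from this project or from the cited studies.

```python
def fine_particle_fraction(diameters_um, masses, cutoff_um=5.0):
    """Fraction of total aerosol mass in particles smaller than the cutoff.

    diameters_um: representative diameter of each size bin (micrometres)
    masses:       mass collected in each bin (any consistent unit)
    """
    if len(diameters_um) != len(masses):
        raise ValueError("diameters_um and masses must have the same length")
    total_mass = sum(masses)
    if total_mass <= 0:
        raise ValueError("total mass must be positive")
    fine_mass = sum(m for d, m in zip(diameters_um, masses) if d < cutoff_um)
    return fine_mass / total_mass


# Hypothetical size distribution of a nebulized liposome preparation.
bin_diameters_um = [1.0, 2.5, 4.0, 6.0, 8.0]
bin_masses = [10.0, 30.0, 20.0, 25.0, 15.0]

fpf = fine_particle_fraction(bin_diameters_um, bin_masses)
print(f"FPF below 5 um: {fpf:.0%}")
```

A formulation or nebulizer setting that shifts more mass into the bins below the cutoff raises the FPF, which is the quantity that would be compared when choosing between devices.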
Nebulization can be achieved through different inhaler devices such as pressurized metered-dose inhalers, dry powder inhalers, medical nebulizers, etc. [46]. A study conducted in 2015, which focused on gene therapy of Cystic Fibrosis via nebulization of the CFTR gene delivered through non-viral vectors, used the Trudell Aero-Eclipse II to generate the dose for the CF patients. The formed cationic liposome solution (5 mL) was simply poured into the device [47]. So far, it seems that a wide range of nebulizers can be used for such a procedure, so another nebulizer available in Turkey would also be expected to perform the procedure properly.
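The 1 µg DNA per 10 µg lipid ratio from the preparation protocol and the 5 mL nebulizer fill mentioned above can be combined into a simple dosing calculation. This is a sketch under stated assumptions: the plasmid masses are hypothetical placeholders, the function name is illustrative, and the 5 mL volume comes from the cited 2015 study, which used a different formulation, so it may not carry over unchanged.

```python
LIPID_UG_PER_DNA_UG = 10.0  # ratio suggested by the preparation protocol


def lipoplex_plan(plasmid_masses_ug, fill_volume_ml=5.0):
    """Lipid mass and concentrations for a co-delivery of several plasmids."""
    if fill_volume_ml <= 0:
        raise ValueError("fill volume must be positive")
    total_dna_ug = sum(plasmid_masses_ug.values())
    lipid_ug = total_dna_ug * LIPID_UG_PER_DNA_UG
    return {
        "total_dna_ug": total_dna_ug,
        "lipid_ug": lipid_ug,
        "dna_ug_per_ml": total_dna_ug / fill_volume_ml,
        "lipid_ug_per_ml": lipid_ug / fill_volume_ml,
    }


# Hypothetical masses for the two plasmids delivered together.
plan = lipoplex_plan({"pCMV-PE2": 50.0, "pU6-pegRNA": 50.0})
for key, value in plan.items():
    print(f"{key}: {value}")
```

Because the ratio is defined on total DNA mass, adding the second plasmid only scales the lipid requirement linearly; whether both plasmids fit into one particle still has to be checked through the size and zeta potential measurements.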
For the project to pass all the required trials, the gene therapy shall be tested in both in vivo and ex vivo systems. Regarding animal models, it has been observed that the CFTR knockout mouse model fails to reproduce the spontaneous progression of human CF lung disease. For this reason, the ferret was chosen for this experiment, as it reflects cystic fibrosis better than the mouse: the CFTR amino acid identity between ferret and human is 92%, and ferrets and humans share all major cell types in the airways, including basal cells, ciliated cells, goblet cells, and intermediate cells, as well as the distribution of submucosal glands [48][49]. For example, protein conservation between humans and ferrets is much higher than between humans and mice for several classes of proteins that are potentially important in pneumonia and remodeling [48].
The submucosal glands of human and ferret cartilaginous airways express very high levels of CFTR mRNA and protein [49]. This similarity is an important conceptual feature that supports the choice of this species as a CF model. The CF ferret recapitulates nearly all phenotypes observed in human patients, from inflammatory and infectious lung disease to CF-related diabetes. Also, since the ferret has a gestation period of 42 days and reaches sexual maturity at 4-6 months, it has obvious advantages for animal modeling over larger species [50].
Since creating CF ferrets with all 3 mutations that we will work on requires a lot of time and resources, we will obtain CF ferrets with the desired mutations from the National Ferret Resource and Research Centre (https://medicine.uiowa.edu/nfrc/services).
Thanks to the service of this center, which specializes in generating genetic ferret models, we will reduce the duration of our experiments. In the event of a hurdle, we will also have a backup plan to build our own CF knockout ferrets. To briefly explain the method: in the past, knockout ferrets were generated by somatic cell nuclear transfer from genetically manipulated fibroblast cells. However, since this method is both time-consuming and expensive, new methods have been sought over time [48]. Recently, the CRISPR/Cas9 system has been used to generate knockout ferrets. If knockout ferrets cannot be supplied by the institution mentioned above, this method will be used. First, a certain number of one-cell-stage fertilized eggs will be collected [48]. Since conventional collection is a surgical operation that requires entering through the ferret's abdomen, the non-surgical embryo collection and transfer method successfully performed at Cornell University will be used instead [50]. A liquid medium will be injected into the donor ferret's uterus. The fluid carries the embryos through the catheter into the collection cups. Viable embryos are selected by microscopic examination and kept in a nutrient solution [50].
The method to be used to create the desired mutations is the CRISPR/Cas9 system. After selecting the sgRNAs suitable for the 3 mutations to be created (F508del, G542X, and R553X), Cas9 mRNA and the sgRNAs will be injected into the collected eggs at the one-cell stage. The zygotes injected with the CRISPR system will then be transferred to surrogate mothers by the non-surgical method described above [48][51].
After the kits are born, genomic DNA from the placenta and tail will be isolated, and the targeted genome modifications and the regions around the CFTR gene will be screened by PCR amplification [52].
Lastly, in order to check for the viability and the efficacy of all the trials done up to that point, the presence and synthesis of a functional CFTR protein shall be detected via an immunodetection method.
For this purpose, ELISA, Western Blot, and Flow Cytometry techniques were evaluated. ELISA combines the specificity of antigen-antibody binding with a coupled enzyme that can be measured with simple enzyme assays [53]. There are 4 types of ELISA: direct, indirect, sandwich, and competitive [54]. ELISA can be used to detect either the presence or the concentration of the target molecules, and it has 2 principal variations. First, ELISA can be used to detect the presence of a specific antigen that is recognized by an antibody. Second, ELISA can be used to examine an antibody that recognizes a specific antigen [53]. Setting up an ELISA requires multiple steps: coating the microtiter plate wells with antigen or antibody, blocking the unbound binding sites to avoid false-positive results, adding the antibody or antigen depending on the ELISA variation selected, adding a specific enzyme-linked antibody, and running the enzyme-substrate reaction to obtain a colored product that indicates a positive result. The most important step for detection with ELISA is the interaction of antigen and antibody [53]. ELISA is preferred because it is a simple procedure with high specificity, is safe and environmentally friendly, provides high efficiency, allows detection at a lower cost, and can be performed without radioactive substances or large amounts of organic solvents. On the other hand, this technique also has disadvantages, such as the difficulty and high cost of preparing an antibody, the need for expensive culture media and complex techniques, the instability of the antibody and the resulting need to store it refrigerated, and a high possibility of false-positive or false-negative results [55].
The Western Blot technique is an important procedure in molecular biology for separating and identifying a specific protein within a complex mixture of proteins extracted from cells. Western blotting consists of 3 steps: separation by size, transfer to a solid support, and labeling of the target protein with appropriate primary and secondary antibodies for visualization [56]. Western Blot is used both for evaluating the size of the target protein and for measuring its expression. The first stage of Western blotting is the preparation of the protein sample, in which the proteins are unfolded into linear chains and coated with negative charges by adding the detergent sodium dodecyl sulfate to the sample. In the second stage, the protein molecules are separated by size with gel electrophoresis. Next, the proteins are transferred from the gel onto a blotting membrane. At this point, the membrane carries all of the protein bands previously found on the gel. Then, the membrane is blocked to prevent nonspecific reactions. After that, the membrane is incubated with the primary antibody that binds specifically to the protein of interest. Later, the unbound primary antibodies are washed away and the membrane is incubated a second time with the secondary antibody, which specifically recognizes the primary antibody.
Secondary antibodies are conjugated to a reporter enzyme that produces color or light, which allows easy detection and visualization [57]. Although Western blotting is a powerful, highly specific, sensitive, and common technique in cell and molecular biology, it has some limitations: it is possible only when a primary antibody specific to the protein of interest is available, antibodies are costly, off-target signals can arise from interaction between the antibody and proteins other than the target, an expert scientist is required, and both the blotting equipment and the methods for antibody optimization are expensive [58]. Thus, to perform an accurate western blot, the quality of the primary antibody, the selection of a proper normalization standard, the prevention of signal saturation, accurate quantification of the signal intensity of the protein of interest, and the definition of a linear range for the antibodies against the target protein are all important [58].
Flow cytometry has been used for several decades in biological fields for measuring the expression of proteins. Flow cytometry is basically an antibody-based method that enables the identification of proteins expressed on the surface of cells as well as in the cytoplasm. It is the most appropriate method for analyzing a specific cell type of interest within a heterogeneous population, and it can be used to identify cells that respond to a treatment based on the expression of the relevant receptor [59]. Additionally, multicolor flow cytometry enables rapid measurement of multiple proteins in parallel, regardless of protein length. Flow cytometry is a powerful technique for overcoming the problems found in ELISA and Western blotting. It requires lower cost, less processing time, and fewer input cells than Western blotting. In addition, flow cytometry shows a better correlation than ELISA, as it includes antibodies that have protective neutralizing activity [60].
The general principle of flow cytometry can be summarized as follows: a sample is passed in a narrow flowing stream of liquid through a laser, which enables detection of the size, granularity, and fluorescent properties of each cell or particle found in the sample [61]. First, the fluorescently labeled sample is loaded into the flow cytometer, and the particles of the sample are directed by hydrodynamic focusing into a thin stream surrounded by faster-flowing fluid. A laser beam is directed onto the stream of fluid, and a number of detectors are aimed at the point where the stream passes through the beam. Some detectors sense light in line with the light beam (Forward Scatter, FSC), some sense light perpendicular to it (Side Scatter, SSC), and one or more detect fluorescence. The detectors register the combination of scattered light and fluorescent light coming from the chemicals found inside the particle. Evaluating the brightness fluctuations at each detector provides information about the physical and chemical structure and properties of each particle [61]. Flow cytometry has the advantages of being rapid, quantitative, and multiparametric, with high accuracy and high specificity, and it can distinguish viable from dead cells. On the other hand, this technique has disadvantages: it is technically challenging, provides no visual confirmation of cell specificity, and has limited sensitivity [62].
Considering the drawbacks and advantages of the 3 techniques, flow cytometry is selected for the detection of CFTR protein expression in our project because of its reliability, simplicity, higher correlation, and lower cost. In addition, other studies support flow cytometry as the best choice for detecting the CFTR protein in cystic fibrosis; they have used it to detect CFTR expression in nasal epithelial cells and in leukocytes.
Nasal epithelial cells will be taken, in an ethical manner, from cystic fibrosis patients and healthy volunteers. Staining patterns can then be analysed by indirect staining with isotype-specific secondary antibodies, combining mIgG1 anti-pan-Cytokeratin with non-mIgG1 CFTR mAbs and mIgG2a anti-E cadherin with CFTR-specific mIgG1 mAbs. For this purpose, the nasal epithelial cells are collected in DMEM and centrifuged at 800 × g for 5 min at 4 °C. Afterward, to obtain single-cell suspensions, they are resuspended in PBS with 5 mM EDTA for 15 minutes and filtered through 50 µm cup Filcons. Finally, the cells are counted by flow cytometry using flow-count beads. The antibodies procured and used in this experimental procedure are the CFTR-specific monoclonal antibodies (mAbs) 24-1 (mouse IgG2a), L12B4 (mIgG2a), M3A7 (mIgG1), 217, 432, 450, and 570 (mIgG1), 596 (mIgG2b), and 769 (mIgG1), together with the mIgG1 mAb anti-pan-Cytokeratin and the mIgG2a mAb anti-E cadherin [63].
The second option for detecting CFTR expression, again carried out ethically, is to take leukocytes from patients who have cystic fibrosis and from healthy people. This detection is based on taking aliquots of peripheral whole blood from patients and healthy volunteers and incubating them with specific fluorescent probes that recognize the CFTR protein expressed on the plasma membrane of leukocytes. Protein immunoprecipitation in this experimental procedure can be done with rabbit antibodies and Protein G Sepharose. Lysis of red blood cells, centrifugation of the sample, incubation with a mouse anti-CD14 antibody conjugated with the tandem fluorophore PE-Cy7 (λnm = 488ex/>750em) or with the allophycocyanin fluorochrome (λnm = 635ex/670em), and cell washing and fixation are the other necessary steps of this experimental procedure [64].
As seen in other studies, both nasal epithelial cells and leukocytes can be used for flow cytometry. The first part of our plan is to take nasal epithelial cells from patients who carry the targeted mutations (F508del, G542X, and R553X) after the gene therapy has been delivered. At the same time, nasal epithelial cells will be taken from healthy volunteers, and after the experimental procedure described above is applied, the results will be compared so that the efficiency of the gene therapy can be assessed. The second part consists of applying the CFTR expression analysis with flow cytometry to leukocytes according to the experimental procedure explained before. Peripheral blood will be taken from patients who carry the F508del, G542X, and R553X mutations after the gene therapy has been delivered, and the expression results will be compared with those of healthy people to assess its effectiveness. Finally, we plan to compare the results obtained with nasal epithelial cells and with leukocytes to confirm the efficacy of our gene therapy.
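The comparison between treated patients and healthy volunteers described above could be expressed as a percentage of CFTR-positive cells per sample. The sketch below is a simplified illustration, not the analysis pipeline of the cited studies: it assumes the positive gate is placed at the brightest event of a control staining, and every intensity value and function name is hypothetical.

```python
def percent_positive(intensities, threshold):
    """Percentage of events whose fluorescence exceeds the gate threshold."""
    if not intensities:
        raise ValueError("sample contains no events")
    positives = sum(1 for value in intensities if value > threshold)
    return 100.0 * positives / len(intensities)


# Hypothetical fluorescence intensities (arbitrary units) per event.
control_staining = [80, 95, 110, 120, 100]
healthy_cells = [300, 450, 90, 500, 380, 410, 130, 60]
treated_patient_cells = [100, 250, 90, 310, 70, 280, 115, 60]

# Simplifying assumption: gate at the brightest control event.
gate = max(control_staining)

healthy_pct = percent_positive(healthy_cells, gate)
treated_pct = percent_positive(treated_patient_cells, gate)
relative_expression = treated_pct / healthy_pct

print(f"Healthy: {healthy_pct:.1f}% CFTR-positive")
print(f"Treated patient: {treated_pct:.1f}% CFTR-positive")
print(f"Treated relative to healthy: {relative_expression:.2f}")
```

The same calculation could be run separately on the nasal epithelial and leukocyte data sets, so that the two readouts can be compared on a common scale.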

Conclusion
Prime editing will be performed by targeting F508del, G542X, and R553X, with 2 pegRNAs designed for the 3 cystic fibrosis mutations. The pegRNAs with the higher G-C content and lower relative nick scores will be chosen. To deliver the plasmids, liposomes will be used because of their advantages in capacity and their lower immune response. Since in vivo clinical trial safety data showing no serious side effects are available, although delivery efficiency is low, we are planning to design both ex vivo and animal trials with lipid-based approaches. Sanger Sequencing and Next Generation Sequencing will be performed for validation and efficacy. Fibroblasts are the cells used in the ex vivo stage of the experiment. The ferret will be the animal model in the next steps, since the symptoms of the knockout ferret match those of patients more closely. Flow cytometry is the immunodetection method selected for checking the expression of the CFTR protein at the last stage.