Candidatus Liberibacter asiaticus manipulates the expression of vitellogenin, cytoskeleton, and endocytotic pathway-related genes to become circulative in its vector, Diaphorina citri (Hemiptera: Psyllidae) Running title: Transcriptomic response of Asian citrus psyllid to huanglongbing

BACKGROUND Citrus greening disease or huanglongbing (HLB) caused by Candidatus Liberibacter asiaticus (CLas) limits the citrus production worldwide. CLas is transmitted by the Asian citrus psyllid (ACP), Diaphorina citri (Hemiptera: Psyllidae) in a persistent-propagative manner. Application of insecticides to manage the psyllid vectors and disease is the most common practice. Understanding the molecular interaction between CLas and ACP and interrupting the interrelationship can provide an alternative to insecticides for managing citrus greening disease. RESULTS Transcriptome analysis of ACP in response to CLas showed differential expression of 3911 genes (2196 upregulated, and 1715 down-regulated) including the key genes of ACP involved in cytoskeleton synthesis and nutrition-related proteins. Majority of the differentially expressed genes were categorized under molecular function followed by cellular components and biological processes. KEGG pathway analysis showed differential regulation of carbohydrate, nucleotide and energy metabolic pathways, the endocytotic pathway and the defense-related pathways. Differential regulation of genes associated with the key pathways might favors CLas to become systemic and propagate in its insect vector. CONCLUSION The study provides an understanding of genes involved in circulation of CLas in ACP. The candidate genes involved in key physiological processes and CLas transmission by ACP would be potential targets for sustainable management of ACP and CLas.

reaction mixture with 1x PCR buffer containing: 0.5 µM of each forward and reverse primers, 0.5 mM of dNTPs, 50 ng of template DNA and 1 unit of DyNAzyme II DNA polymerase (Thermo Fisher Scientific, MA, USA). The PCR conditions of 3 min of initial denaturation at 94°C, followed by 35 cycles of 30 s of denaturation at 94°C, 30 s of annealing at 52°C, 45 s of extension at 72°C, and a single final extension of 10 min at 72°C were used. PCR products were resolved on 1.6% agarose gel containing ethidium bromide and visualized in a UV illuminator.
Two sets of CLas-infected adult psyllids (designated as P+CLas1 and P+CLas2) and one set of healthy adult psyllid (designated as P-CLas) were subjected to total RNA isolation using TRIzol (Invitrogen) following manufacturer's protocol. RNA concentration and quality were assessed using standard procedures as recommended for Illumina (Illumina, San Diego, CA, USA) sequencing. Another set of samples (both P+CLas, P-CLas) obtained from the same batch were preserved in -80º C for qRT-PCR experiments.

Library preparation and sequencing
In brief, 500 ng of total RNA was used to enrich mRNA using NEB Magnetic mRNA Isolation Kit. The transcriptome library was prepared using NEB ultraII RNA library prep kit and sequenced using Illumina Next Seq 500 paired-end technology. The enriched mRNA was fragmented (approximately 200 bp) using a fragmentation buffer. Random hexamer primers were then added and hybridized to complementary RNA sequences. These short fragments were used as templates to synthesize the first-strand cDNA using reverse transcriptase and dNTPs. The DNA-RNA hybrids synthesized during first-strand cDNA synthesis were converted into full-length double-stranded cDNA using RNase H and E. coli DNA polymerase I and then second-strand cDNA was synthesized using second strand enzyme mix and buffer. The double-stranded cDNA fragments were purified using 1.8X Ampure beads. The purified double-stranded cDNA was end-repaired to ensure that each molecule was free of overhangs and had 5′ phosphates and 3′ hydroxyls before the adaptor ligation. The adaptor-ligated DNA was purified using Ampure beads and was enriched using specific primers, compatible with sequencing on to the Illumina platforms. The final enriched library was purified and quantified by Qubit and the size was analyzed by Bioanalyzer.

Pre-processing of raw reads and differential gene expression analysis
The quality of raw reads was visualized using FASTQC v0.11.2. Raw reads were pre-processed by removing the adaptor sequences, duplicated sequences, ambiguous and low-quality reads to obtain high-quality reads. After pre-processing, the quality of reads was ensured again using FASTQC. Pre-processed reads were mapped to the reference genome of D. citri (GCF_000475195.1) using TopHat v2.1.0 software with default parameters except for the mismatch parameter that was set to two. Gene expression was quantified using Subread v1.5.0 software with default parameters based on the Fragments Per Kilobase of transcript sequence per Millions base pairssequenced (FPKM values). Average FPKM value of expressed genes from two P+CLas samples (designated as combined P+CLas) was compared with FPKM value of expressed genes of P-CLas sample for identification of differentially expresses genes (DEGs) using edgeR v1.24.0 software (without biological replicates). Genes having a normalized P-value <0.05 were only considered for differential expression analysis. Genes with FPKM values >1 and <-1 were regarded as up and down-regulated, respectively. GO enrichment analysis of DEGs was performed (corrected p-value <0.05) using a web-based tool AgriGO. 7  DEGs was performed to identify the pathways that were differentially regulated between combined P+CLas and P-CLas samples using the software KOBAS v3.0 with corrected p-value <0.05.

Quantitative real time-PCR (qRT-PCR) analysis
To validate a few most differentially regulated genes (vitellogenin-1, vitellogenin-2, troponin C and tropomyosin-1) of ACP, the top two genes each from up-and down-regulated were selected. QRT-PCR analysis was performed in Insta Q 48m (Himedia) using the Maxima SYBR Green/ROX qPCR Master Mix (Thermo Fisher Scientific). αtubulin was used as a reference for normalization of gene expression. 8  reaction mixture consisted of 2 µl of cDNA, 0.5 µl of 10 µM forward and reverse primers each, 12.5 µl of SYBR Green master mix and 14.5 µl of nuclease-free water. Following thermal cycling was followed, initial denaturation at 94º C for 3 min, then 30 cycles of 94 º C for 30s, 62 º C for 30s and 72 º C for 30s followed by a melting stage. Each reaction including a non-template control was replicated thrice. The relative gene expression was calculated using the 2 -ΔΔCT method. 9 Statistical significance of the qRT-PCR data was analyzed using Tukey's multiple comparison tests. Values with p < 0.05 were considered to be statistically significant.

CLas infection in ACP and citrus plant
PCR with CLas-infected citrus plant and ACP yielded corresponding amplification of 809 bp in agarose gel electrophoresis. No amplification was observed in the case of CLas-free citrus plant and ACP exposed to healthy citrus plants.

Pre-processing of raw data and mapping of clean reads to reference genome
Raw reads obtained from three libraries (P+CLas1, 2 and P-CLas) ranged from 63.96 to 77.18 million with an average of 68.82 million. 98.93% to 99.21% of the total raw reads in three libraries qualified as clean reads after pre-processing steps. The proportion of clean reads that mapped to ACP genome ranged from 62.8% to 48.4% across libraries with an average of 58.7%. Among the mapped reads, 39.2% to 47.5% of the total reads were uniquely mapped while 9.2% to 15.3% of the total reads were mapped to multiple loci in the genome ( Table 2).

Global gene expression analysis
Genes having FPKM values greater than or equal to 1 were considered to be expressed. A total of 17,437 and 17,059 expressed genes were found in P+CLas1 and P+CLas2 samples, respectively while 15,444 genes were expressed in P-CLas. In DEG analysis using edgeR with BH correction, a heatmap of significantly expressed genes was generated. It was observed that the P+CLas samples clustered together as their gene expression patterns were nearly similar. However, both the samples differed from P-CLas indicating their differential gene expression pattern from the later (Fig. 1).

Differential gene expression analysis
In total, 3911 genes (27%) were found to be differentially expressed between combined P+CLas and P-CLas samples while the expression of 73% of the genes remained unaltered ( Fig. 2A). Among the DEGs, 2196 and 1715 genes were up-regulated and down-regulated respectively in combined P+CLas as compared to P-CLas ( Fig. 2B). The top 20 differentially expressed (up and down-regulated) genes in CLas-infected ACP have been listed in Table 3. Nutrition-related genes of ACP like vitellogenin-1, -2, and -3, and extensin were found highly overexpressed post exposure to CLas. Significant downregulation of cytoskeleton-related genes of ACP such as myosin heavy chain, troponin C, tropomyosin-1, E3 ubiquitin, and flightin was also recorded. In addition, few genes possibly involved in CLas-ACP interaction were differentially expressed such as laminin, papilin, integrin, talin, phenoloxidase, serine proteases, hemocytin, cathepsin B, and ABC transporters.

Functional analysis of DEGs
Gene Ontology (GO) terms were assigned to the 3911 DEGs. Based on the GO terms, DEGs were broadly grouped into three categories viz. genes associated with cellular components, biological processes, and molecular functions. The majority of the DEGs were categorized under molecular functions (50%) followed by cellular components (33%) and biological processes (17%) (Fig. 3A). In molecular function category, genes associated with catalytic activity, binding and nucleotide-binding were mostly enriched. Genes associated with cell, cell parts, membrane sub-categories were differentially enriched in cellular component category. In the biological processes, genes associated with metabolic, cellular and primary metabolic processes were the most enriched (Fig. 3B). Fig. 4 shows the highly enriched molecular functions in CLas-infected ACP.

KEGG pathway analysis of DEGs
Based on the KEGG pathway analysis, the DEGs were involved in metabolism, environmental and nucleotide signal processing, organismal systems and cellular processes. The major metabolic pathways affected include carbohydrate, nucleotide, and energy metabolism while transcription, translation, folding, sorting and degradation were the differentially regulated nucleotide signal processing pathways. Genes involved in signal transduction and aging that aid in environmental signal processing and organismal systems respectively were also differentially regulated. Transport and catabolism related pathways remained the most affected cellular processes (Fig. 5). The results of KEGG pathway analysis were consistent with the pathways identified in GO enrichment analysis.

Validation of DEGs in qRT-PCR
Expression of vitellogenin-1 and vitellogenin-2 genes was 8.0 and 8.4-fold higher respectively in P+CLas sample as compared to P-CLas. However, troponin C and tropomyosin-1 genes were down-regulated to 6.0 and 5.4-fold respectively in P+CLas sample in comparison with P-CLas samples. Similarly, RNA-seq analysis showed that expression of vitellogenin-1 and vitellogenin-2 genes were up-regulated to 18.4 and 17.0-fold respectively while the troponin C and tropomyosin-1 genes were down-regulated to 5.8 and 5.0-fold respectively in CLas-infected ACP samples (Fig. 6). Melt curve analysis in qRT-PCR confirmed the specificity of the reactions.

Discussion
A host-pathogen-vector relationship involves complex direct and indirect interactions where pathogen and vector compete with each other for a shared host. They can affect each other by altering the biochemical traits of the host. A hypothesis of the manipulation of vectors by the pathogens has been proposed by Ingwell et al. 10 This could occur by modifying the physiological processes of the vector directly. 11,12 Both positive and negative effects of plant pathogens on their vectors have been recorded. [12][13][14] CLas is transmitted by ACP in a persistentpropagative manner. 4 Multiplication of CLas in ACP suggests a potential alteration in the physiological process of ACP. ACP nymphs acquire CLas from infected hosts efficiently than the adults and become viruliferous in late instars or after emergence as adults. ACP adults can also acquire the virus in 15-30 min feeding on infected host plants but an increase in titer of CLas was found when acquired by nymphs. 15 Once acquired, CLas crosses the mid-gut barrier and reaches hemolymph of ACP. It is thought to be transported to the salivary gland and accumulates there to infect healthy hosts during feeding and salivation. CLas also invades reproductive organ of ACP and carried to the next generation through eggs. 4 The molecular mechanism of interaction between ACP and CLas is still unknown. Transcriptomic studies have been used to understand the interaction between pathogen and vector at the molecular level. 15 The present study was performed to understand the differential gene expression pattern in ACP in response to CLas infection. Cluster analysis of expressed genes grouped the two sets of ACP samples infected with CLas and both these samples were different from the healthy ACP sample indicating the differential gene expression pattern between CLas-infected and CLas-free psyllids.
A group of vitellogenin genes of ACP was found upregulated post CLas exposure in the current study.
Vitellogenin proteins are the major egg yolk protein precursors in insects. 15 However, the expression of vitellogenin gene in honey bee males suggests their possible role other than yolk formation. 16  breeding. 23 In a recent study, Lan et al. 24 showed that rice dwarf virus (RDV) reduced the survival and fecundity of its vector, Nephotettix cincticeps by downregulating the expression of troponin C. Similarly, CLas infection of ACP was shown to downregulate the cytoskeletal proteins to induce changes in cytoskeletal configuration. 25 Downregulation of troponin C and tropomyosin-1 genes in CLas-infected ACP in the current study also suggests the possible cytoskeletal modification in cells of ACP to mediate the circulation of CLas within ACP.
In addition, flightin, a myosin-binding protein that facilitates thick filament assembly and muscle integrity, 26 also showed reduced expression in CLas-infected ACP. Taken  Among the major metabolic pathways of ACP affected by CLas, carbohydrate, energy and nucleotide metabolism were important. To meet out the energy and nucleotide requirements, CLas manipulates the respective pathways of the vector. 29 Considering the non-metabolic pathways, endocytotic pathway plays an inevitable role in host invasion by CLas. 3 Enrichment of genes associated with cellular endocytosis in ACP suggests possible manipulation of endocytotic pathway upon CLas infection.

Conclusion
In conclusion, a total of 3911 genes were differentially expressed including 2196 up-regulated and 1715 downregulated genes. Most DEGs included vitellogenins, extensin, cytoskeleton and endocytotic pathway-related genes. Molecular functions were highly enriched in CLas-infected ACP. Carbohydrate, energy, and nucleotide metabolism were the major metabolic pathways affected while endocytotic and defense-related pathways that facilitate the circulation and multiplication of CLas in ACP were also affected. The putative genes of ACP that are highly regulated in response to CLas can be targeted for the effective management of CLas once validated functionally.   The overall results of FPKM cluster analysis, clustered using the log2 (FPKM+1) value. Red denotes genes with high expression levels, and blue denotes genes with low expression levels. The color range from blue to red represents the log2 (FPKM+1) value from large to small.

Figure 4
Illustration of top GO DAG under molecular functions category. Each node represents a GO term, the darker the color is, the higher is the enrichment level of the term. The name and p-value of each term are present on the node.

Figure 6
Validation of the differential expression of selected genes related to insect cytoskeleton and nutrition using qRT-PCR.