ARTICLE | doi:10.20944/preprints202107.0562.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: genome; transcriptome; gene models; Leishmania; Illumina sequencing; PacBio sequencing; expression levels; untranslated regions (UTRs); SL-addition sites; polyadenylation sites
Online: 26 July 2021 (10:23:40 CEST)
Leishmania major is the main causative agent of cutaneous leishmaniasis in humans. The Friedlin strain of this species (LmjF) was chosen when a multi-laboratory consortium set out to decipher the first genome sequence for a parasite of the genus Leishmania. That objective was attained in 2005, a milestone for Leishmania molecular biology studies around the world. Although the LmjF genome was sequenced following a shotgun strategy with classical Sanger sequencing, the results were excellent, and this assembly served as the reference for subsequent genome assemblies in other Leishmania species. Here, we present a new assembly for the genome of this strain (named LMJFC for clarity), generated by combining two high-throughput sequencing platforms: Illumina short-read sequencing and PacBio Single Molecule Real-Time (SMRT) sequencing, which provides long reads. In addition to the resolution of uncertain nucleotide positions, several genomic regions have been reorganized and the composition of tandemly repeated gene loci has been defined more precisely. The genome annotation has also been improved by adding 542 genes and by defining more accurate coding sequences for around two hundred genes, based on the transcriptome delimitation also carried out in this work. As a result, we provide gene models (including untranslated regions and introns) for 11,238 genes. Genomic information ultimately determines the biology of every organism; therefore, our understanding of molecular mechanisms depends on the availability of precise genome sequences and accurate gene annotations. In this regard, this work provides an improved genome sequence and updated transcriptome annotations for the reference L. major Friedlin strain.
ARTICLE | doi:10.20944/preprints202008.0281.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Leishmania infantum; proteome; post-translational modifications (PTMs); proteogenomics; mass spectrometry
Online: 12 August 2020 (10:10:45 CEST)
Leishmania infantum causes visceral leishmaniasis (kala-azar), the most severe form of leishmaniasis, which is lethal if untreated. A few years ago, the re-sequencing and de novo assembly of the L. infantum (strain JPCM5) genome were accomplished; here, we aimed to describe and characterize the experimental proteome of this species. We performed a proteomic analysis of axenically cultured promastigotes and carried out a detailed comparison with the other Leishmania experimental proteomes published to date. We identified 2,352 proteins by searching mass spectrometry data against a database built from the six-frame translated genome sequence of L. infantum. We detected many proteins belonging to organelles such as glycosomes, mitochondria and the flagellum, as well as many metabolic enzymes and a large number of putative RNA-binding proteins and molecular chaperones. Moreover, we listed the proteins presenting post-translational modifications, such as phosphorylations, acetylations and methylations, among others. In addition, the identification of peptides mapping to genomic regions previously annotated as non-coding has allowed annotations to be corrected, leading to N-terminal extensions of protein sequences and to the discovery of eight novel protein-coding genes. The alliance of proteomics, genomics and transcriptomics has proved a powerful combination for improving the L. infantum reference genome annotation.
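To illustrate what a six-frame translated database involves, the following is a minimal Python sketch, not the authors' pipeline. It translates a DNA sequence in the three forward and three reverse-complement reading frames and splits each translation at stop codons to obtain candidate protein segments. The function names, the frame labels and the `min_len` cutoff are hypothetical choices made for this illustration, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), laid out in the
# conventional TCAG base order: TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Map hypothetical frame labels ('+1'..'+3', '-1'..'-3') to translations."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


def candidate_segments(protein, min_len=3):
    """Split a translation at stop codons ('*') and keep segments >= min_len."""
    return [seg for seg in protein.split("*") if len(seg) >= min_len]


if __name__ == "__main__":
    # Toy sequence for demonstration only; not taken from any Leishmania genome.
    for label, protein in six_frame_translation("ATGGCCTAAGGTACCGAT").items():
        print(label, protein, candidate_segments(protein))
```

Because every frame on both strands is translated, peptides can be matched to regions that the existing annotation treats as non-coding, which is the property that makes this kind of database useful for refining gene models.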