DATA DESCRIPTOR | doi:10.20944/preprints202208.0349.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: NGS; Andean; neglected breed; genome
Online: 18 August 2022 (11:12:25 CEST)
The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the PCC using a paired-end 150 strategy on the Illumina HiSeq 2500 platform, obtaining 320 GB of sequencing data. The assembled PCC genome was 2.77 Gb in size, with a contig N50 of 108 Mb and 92.59% complete BUSCOs. We also identified repetitive DNA in 40.22% of the genome assembly, with retroelements occupying 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. We downloaded proteomes and genomes of the Bovinae subfamily and conducted a comparative analysis with our draft genome. Phylogenomic analysis showed that PCC is related to Bos indicus. We also identified 7,746 gene families shared among the Bovinae subfamily. This first PCC genome is expected to contribute to a better understanding of the genetics underlying its adaptation to the harsh conditions of the Andean ecosystem, and of its evolution.
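For readers unfamiliar with the N50 statistic quoted in this abstract, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The contig lengths below are hypothetical toy values, not data from the PCC assembly, and the function name `n50` is chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half = sum(ordered) / 2                  # half of the total assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

With these toy values the total is 300, so the running sum reaches the halfway point of 150 at the second contig, and the function returns 70.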
ARTICLE | doi:10.20944/preprints202205.0225.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: chloroplast; genome; sweet cucumber; Solanaceae; next-generation sequencing
Online: 17 May 2022 (08:38:03 CEST)
Sweet cucumber (Solanum muricatum), of sect. Basarthrum, is a neglected horticultural crop native to the Andean region. It is naturally distributed very close to potatoes (Solanum sect. Petota) and tomatoes (Solanum sect. Lycopersicon), two groups of high economic importance. To date, molecular tools for this crop are still lacking. Here we obtained the first complete chloroplast (cp) genome of sweet cucumber and compared it with those of seven Solanaceae species. Paired-end clean reads were obtained from a PE 150 library on the Illumina HiSeq 2500 platform. The complete cp genome of S. muricatum was 155,681 bp in length with a typical quadripartite structure, containing a large single-copy (LSC) region (86,182 bp) and a small single-copy (SSC) region (18,360 bp), separated by two inverted repeat (IR) regions (25,568 bp). Annotation of the chloroplast genome predicted 88 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, 37 transfer RNA (tRNA) genes, and one pseudogene. A total of 48 perfect microsatellites were identified; mononucleotide repeats (32) were the most abundant, followed by tetranucleotide (6) and dinucleotide (5) repeats. SSRs with trinucleotide (3), pentanucleotide (1), and hexanucleotide (1) repeat motifs were identified in lower quantity. Most of these repeats were distributed in the noncoding regions. Whole chloroplast genome comparison with the other seven Solanaceae species revealed that the small and large single-copy regions showed more divergence than the inverted repeat regions. Finally, phylogenetic analysis resolved S. muricatum as a sister species to members of sections Petota + Lycopersicum + Etuberosum. This study reports for the first time the genome organization, gene content, and structural features of the cp genome of S. muricatum. This study may also provide the basis for evaluating genetic diversity within Solanum, and will be useful for examining the evolutionary processes in sweet cucumber landraces.
ARTICLE | doi:10.20944/preprints202203.0224.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: zoogenetic resources; organelle; genomics; NGS; cattle; Bos taurus
Online: 16 March 2022 (07:36:01 CET)
Cattle spread throughout the American continent during the colonization years, giving rise to creole breeds that adapted to a wide range of climate conditions. The population of creole cattle in Peru is decreasing, mainly due to the introduction of more productive breeds in recent years. During the last 15 years, there has been significant progress in cattle genomics. However, little is known about the genetics of the Peruvian creole cattle (PCC) despite its importance for (i) improving productivity in the Andean region, (ii) agricultural labor, and (iii) cultural traditions. In addition, the origin and phylogenetic relationships of the PCC are still unclear. In order to promote the conservation of the PCC, we sequenced for the first time the mitochondrial genome of a creole bull from the highlands of Arequipa, which also possessed exceptional fighting skills and was employed for agricultural tasks. The total mitochondrial genome sequence is 16,339 bp in length, with a base composition of 31.43% A, 28.64% T, 26.81% C, and 13.12% G. It contains 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and a control region. Among the 37 genes, 28 were positioned on the H-strand and nine were positioned on the L-strand. The most frequently used codons were CUA (Leucine), AUA (Isoleucine), AUU (Isoleucine), AUC (Isoleucine), and ACA (Threonine). Maximum likelihood reconstruction using complete mitochondrial genome sequences clearly demonstrated that the PCC is strongly related to native African breeds, giving insights into the ancestry of PCC. The annotated mitochondrial genome of PCC would serve as an important genetic data set for further breeding work and conservation strategies.
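The base composition and gene counts reported in this abstract can be cross-checked with simple arithmetic. The short sketch below uses only the figures quoted above; the variable names are illustrative, and the derived GC and AT contents are computed here rather than taken from the abstract.

```python
# Base composition (percent) as reported in the abstract.
base_pct = {"A": 31.43, "T": 28.64, "C": 26.81, "G": 13.12}

total_pct = round(sum(base_pct.values()), 2)       # should be 100.0
gc_pct = round(base_pct["G"] + base_pct["C"], 2)   # derived GC content
at_pct = round(base_pct["A"] + base_pct["T"], 2)   # derived AT content

# Gene counts as reported: protein-coding, rRNA, tRNA.
gene_counts = {"protein_coding": 13, "rRNA": 2, "tRNA": 22}
total_genes = sum(gene_counts.values())

# Strand assignment as reported: H-strand and L-strand.
strand_counts = {"H": 28, "L": 9}

print(f"total = {total_pct}%, GC = {gc_pct}%, AT = {at_pct}%")
print(f"genes = {total_genes}, by strand = {sum(strand_counts.values())}")
```

The four percentages sum to 100%, giving a GC content of 39.93% and an AT content of 60.07%, and both the gene-type counts and the strand counts add up to the 37 genes stated in the abstract.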
ARTICLE | doi:10.20944/preprints202111.0533.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: chloroplast; genetic resources; genomics; capirona; phylogenomics
Online: 29 November 2021 (12:32:24 CET)
Capirona (Calycophyllum spruceanum Benth.) belongs to the subfamily Ixoroideae, one of the major lineages in the Rubiaceae family. It is an important timber tree that originated in the Amazon Basin and is widely distributed in Bolivia, Peru, Colombia, and Brazil. In this study, we obtained the first complete chloroplast (cp) genome of capirona from the department of Madre de Dios, located in the Peruvian Amazon. High-quality genomic DNA was used to construct libraries. Paired-end clean reads were obtained from a PE 150 library on the Illumina HiSeq 2500 platform. The complete cp genome of C. spruceanum is 154,480 bp in length with a typical quadripartite structure, containing a large single-copy (LSC) region (84,813 bp) and a small single-copy (SSC) region (18,101 bp), separated by two inverted repeat (IR) regions (25,783 bp). Annotation of the C. spruceanum cp genome predicted 87 protein-coding genes (CDS), 8 ribosomal RNA (rRNA) genes, 37 transfer RNA (tRNA) genes, and one pseudogene. A total of 41 simple sequence repeats (SSRs) in this cp genome were divided into mononucleotide (29), dinucleotide (5), trinucleotide (3), and tetranucleotide (4) repeats. Most of these repeats were distributed in the noncoding regions. Whole chloroplast genome comparison with the other six Ixoroideae species revealed that the small and large single-copy regions showed more divergence than the inverted repeat regions. Finally, phylogenetic analysis resolved C. spruceanum as a sister species to Emmenopterys henryi and confirmed its position within the subfamily Ixoroideae. This study reports for the first time the genome organization, gene content, and structural features of the chloroplast genome of C. spruceanum, providing valuable information for genetic and evolutionary studies in the genus Calycophyllum and beyond.
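In a quadripartite chloroplast genome, the total length is the sum of the LSC, the SSC, and the two IR copies. The sketch below checks the figures reported in this abstract against that relationship, assuming that the quoted IR length (25,783 bp) refers to each copy rather than to both combined; the variable names are illustrative only.

```python
# Region lengths (bp) as reported in the abstract.
lsc = 84_813        # large single-copy region
ssc = 18_101        # small single-copy region
ir = 25_783         # one inverted repeat copy (assumed to be per copy)
reported_total = 154_480

# Quadripartite structure: LSC + SSC + two IR copies.
computed_total = lsc + ssc + 2 * ir
print(computed_total == reported_total)

# SSR counts by motif length, as reported.
ssr_counts = {"mono": 29, "di": 5, "tri": 3, "tetra": 4}
print(sum(ssr_counts.values()))
```

Under that assumption the region lengths add up exactly to the reported 154,480 bp, and the four SSR classes sum to the 41 repeats stated in the abstract.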