Estrada, R.; Corredor, F.-A.; Figueroa, D.; Salazar, W.; Quilcate, C.; Vásquez, H.V.; Maicelo, J.L.; Gonzales, J.; Arbizu, C.I. Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (Bos taurus). Data 2022, 7, 155.
Abstract
The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. Developing a modern breeding program and conservation strategies for the PCC requires a better understanding of the genetics of this breed. We sequenced the whole genome of the PCC using a 150 bp paired-end strategy on the Illumina HiSeq 2500 platform, obtaining 320 GB of sequencing data. The assembled PCC genome was 2.77 Gb in size, with a contig N50 of 108 Mb and 92.59% complete BUSCOs. Repetitive DNA accounted for 40.22% of the genome assembly, with retroelements occupying 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. We downloaded proteomes and genomes of the Bovinae subfamily and conducted a comparative analysis with our draft genome. Phylogenomic analysis showed that the PCC is related to Bos indicus. We also identified 7,746 gene families shared among the Bovinae subfamily. This first PCC genome is expected to contribute to a better understanding of the genetic basis of the breed's adaptation to the harsh conditions of the Andean ecosystem, and of its evolution.
Keywords
NGS; Andean; neglected breed; genome
Subject
Biology and Life Sciences, Biochemistry and Molecular Biology
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.