DATA DESCRIPTOR | doi:10.20944/preprints202208.0349.v1
Online: 18 August 2022 (11:12:25 CEST)
The Peruvian creole cattle (PCC) is a neglected breed, and is an essential livestock resource in the Andean region of Peru. To develop a modern breeding program and conservation strategies for the PCC, a better understanding of the genetics of this breed is needed. We sequenced the whole genome of the PCC using a paired-end 150 strategy on the Illumina HiSeq 2500 platform, obtaining 320 GB of sequencing data. The obtained genome size of the PCC was 2.77 Gb with a contig N50 of 108Mb and 92.59% complete BUSCOs. Also, we identified 40.22% of repetitive DNA of the genome assembly, of which retroelements occupy 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. We downloaded proteomes and genomes of the Bovinae subfamily, and conducted a comparative analysis with our draft genome. Phylogenomic analysis showed that PCC is related to Bos indicus. Also, we identified 7,746 family genes shared among the Bovinae subfamily. This first PCC genome is expected to contribute to a better understanding of its genetics to adapt to the tough conditions of the Andean ecosystem, and evolution.