Preprint, Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

HPV DeepSeq: Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench

Version 1 : Received: 4 August 2021 / Approved: 6 August 2021 / Online: 6 August 2021 (08:00:00 CEST)

A peer-reviewed article of this Preprint also exists.

Shen-Gunther, J.; Xia, Q.; Cai, H.; Wang, Y. HPV DeepSeq: An Ultra-Fast Method of NGS Data Analysis and Visualization Using Automated Workflows and a Customized Papillomavirus Database in CLC Genomics Workbench. Pathogens 2021, 10, 1026.

Journal reference: Pathogens 2021, 10, 1026
DOI: 10.3390/pathogens10081026

Abstract

Next-generation sequencing (NGS) has enabled human papillomavirus (HPV) virome profiling for in-depth investigation of viral evolution and pathogenesis. However, viral computational analysis remains a bottleneck due to semantic discrepancies between computational tools and curated reference genomes. To address this, we developed and tested automated workflows for HPV taxonomic profiling and visualization using a customized Papillomavirus database in the CLC Microbial Genomics Module. HPV genomes from the Papilloma Virus Episteme were customized and incorporated into CLC “ready-to-use” workflows for stepwise data processing: (1) Taxonomic Analysis, (2) Estimate Alpha/Beta Diversities, and (3) Map Reads to Reference. Low-grade (n = 95) and high-grade (n = 60) Pap smears were tested, with the following collective runtimes: Taxonomic Analysis (36 min), Alpha/Beta Diversities (5 s), and Map Reads (45 min). Converting tabular output to visualizations required one or two keystrokes. Biodiversity analysis comparing low-grade (LSIL) and high-grade squamous intraepithelial lesions (HSIL) revealed a loss of species richness and a gain of dominance by HPV-16 in HSIL. Integrating clinically relevant, taxonomized HPV reference genomes within automated workflows proved to be an ultra-fast method of virome profiling. The entire process, named “HPV DeepSeq”, provides a simple, accurate, and practical means of NGS data analysis for a broad range of applications in viral research.
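
The richness and dominance measures referred to in the abstract can be illustrated with a minimal Python sketch. It is not the CLC Microbial Genomics Module implementation, whose exact index definitions are not given here. The function name `alpha_diversity`, the choice of Shannon entropy and Simpson dominance as indices, and the read counts are all assumptions made for illustration only; they are not data from the study.

```python
import math


def alpha_diversity(counts):
    """Return richness, Shannon entropy, and Simpson dominance for one sample.

    `counts` maps a genotype label to its read count (hypothetical input format).
    """
    # Only genotypes with at least one read contribute to the indices.
    present = [c for c in counts.values() if c > 0]
    total = sum(present)
    if total == 0:
        return {"richness": 0, "shannon": 0.0, "simpson_dominance": 0.0}
    props = [c / total for c in present]
    return {
        # Richness: number of distinct genotypes detected.
        "richness": len(present),
        # Shannon entropy (natural log): higher means more even diversity.
        "shannon": -sum(p * math.log(p) for p in props),
        # Simpson dominance: approaches 1 when a single genotype dominates.
        "simpson_dominance": sum(p * p for p in props),
    }


# Hypothetical read counts, invented for illustration (not study data).
even_sample = {"HPV-16": 250, "HPV-51": 250, "HPV-52": 250, "HPV-66": 250}
dominated_sample = {"HPV-16": 900, "HPV-51": 0, "HPV-52": 100, "HPV-66": 0}

even_result = alpha_diversity(even_sample)
dominated_result = alpha_diversity(dominated_sample)
```

In this toy comparison the second sample has lower richness and higher Simpson dominance than the first, which is the kind of shift the abstract describes between LSIL and HSIL.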

Keywords

Bioinformatics; Cervical cancer; Deep Sequencing; Human papillomavirus; HPV genotyping; Metagenome; Next generation sequencing; Taxonomic classification; Virome

Subject

LIFE SCIENCES, Biochemistry
