Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Wastewater Based Epidemiology, Phylogenetic Analysis and Machine Learning Approach to Describe the Evolution of SARS-CoV-2 in the South-East of Spain

Version 1: Received: 3 May 2023 / Approved: 5 May 2023 / Online: 5 May 2023 (03:31:35 CEST)

A peer-reviewed article of this Preprint also exists.

Férez, J.A.; Cuevas-Ferrando, E.; Ayala-San Nicolás, M.; Simón Andreu, P.J.; López, R.; Truchado, P.; Sánchez, G.; Allende, A. Wastewater-Based Epidemiology to Describe the Evolution of SARS-CoV-2 in the South-East of Spain, and Application of Phylogenetic Analysis and a Machine Learning Approach. Viruses 2023, 15, 1499.

Abstract

The COVID-19 pandemic has posed a significant global threat, leading to several initiatives for its control and management. One such initiative is wastewater-based epidemiology, which has gained attention for its potential to provide early warning of virus outbreaks and real-time information on the virus's spread. In this study, water samples from two wastewater treatment plants (WWTPs) located in the south-east of Spain (Region of Murcia), namely Murcia and Cartagena, were analyzed by RT-qPCR, phylogenetic analysis, and a machine learning approach. The aim was to determine whether SARS-CoV-2 detection in the WWTPs of these two cities could serve as a proxy for the virus's spread in the population. The results confirmed that the levels of SARS-CoV-2 in these wastewater samples changed in line with the number of SARS-CoV-2 cases detected in the population, and that variant occurrences were consistent with clinically reported data. Additionally, the phylogenetic analysis showed that samples collected close together in time were more similar to each other than samples collected further apart in time. A second analysis, using a machine learning approach based on the mutations found in the SARS-CoV-2 spike protein, was also conducted. Hierarchical Clustering (HC) was used as an efficient unsupervised approach for data analysis. Results indicated that samples obtained in October 2022 in Murcia and Cartagena were significantly different, which corresponded well with the different virus variants circulating in the two locations. The methods proposed in this study are adequate for comparing the Accumulated Natural Vector (ANV) of the SARS-CoV-2 sequences as a preliminary evaluation of potential changes in the variants circulating in a given population at a specific time point.

Keywords

SARS-CoV-2; Epidemiology; Wastewater-based Epidemiology; Phylogenetic Analysis; Machine Learning Approach; Molecular virology

Subject

Biology and Life Sciences, Virology
