Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: SARS-CoV-2; Phylogenetics; Asia
Online: 15 January 2021 (13:14:15 CET)
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the cause of the current coronavirus pandemic, was first confirmed in China in late December 2019. In this study, we analyzed 131 complete sequences of SARS-CoV-2 from Asia. Our results show that there are fifteen major mutations in Asia, most of which co-evolved. There were five groups based on co-mutations, three of which fell in clade G: (241C>T, 3037C>T, 14408C>T, and 23403A>G), (28881G>A, 28882G>A, 28883G>C and 23403A>G) and (25563G>T and 23403A>G). Co-mutations in (8782C>T and 28144T>C) and (1397G>A, 28688T>C, 29742G>T and 11083G>T) clustered in clade S and in a new clade outside of the GISAID classification, respectively. Sequences with the 26144G>T mutation had low variability and no co-mutation, and formed clade V. We showed that most of the viruses circulating in Asia fall into five co-mutation groups, which may affect transmissibility and vaccine design strategies.
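The abstract does not say how co-mutations were detected. As a rough illustration only, the sketch below flags pairs of mutations that occur in nearly the same set of samples, using the Jaccard index of their carrier sets. The sample identifiers, the toy data, and the 0.8 threshold are assumptions made for this example, not values from the study.

```python
from itertools import combinations


def carriers(samples):
    """Map each mutation to the set of sample ids that carry it."""
    by_mutation = {}
    for sample_id, mutations in samples.items():
        for mutation in mutations:
            by_mutation.setdefault(mutation, set()).add(sample_id)
    return by_mutation


def co_mutation_pairs(samples, min_jaccard=0.8):
    """Return (mutation_a, mutation_b, jaccard) for pairs that co-occur.

    The Jaccard index is |A & B| / |A | B| over the carrier sets; the
    threshold is a hypothetical choice, not one taken from the abstract.
    """
    by_mutation = carriers(samples)
    pairs = []
    for a, b in combinations(sorted(by_mutation), 2):
        shared = by_mutation[a] & by_mutation[b]
        either = by_mutation[a] | by_mutation[b]
        jaccard = len(shared) / len(either)
        if jaccard >= min_jaccard:
            pairs.append((a, b, jaccard))
    return pairs


# Toy input (hypothetical samples; mutation names taken from the abstract).
toy_samples = {
    "s1": {"8782C>T", "28144T>C"},
    "s2": {"8782C>T", "28144T>C"},
    "s3": {"26144G>T"},
    "s4": {"241C>T", "23403A>G"},
}
toy_pairs = co_mutation_pairs(toy_samples)
```

On real data the input would come from variant calls against a reference genome, and a pairwise measure such as this one would only be a first step before checking the groups against a phylogeny.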
ARTICLE | doi:10.20944/preprints202009.0327.v1
Subject: Biology And Life Sciences, Virology Keywords: Ghana; SARS-CoV-2; transmission; Phylogenetics
Online: 15 September 2020 (04:24:17 CEST)
In regions lacking genomic data, analysis of sequences from the early stages of an outbreak can provide important insights into the diversity of pathogens present. Following the detection of the first imported case of COVID-19 in the Northern sector of Ghana on 13th March 2020, we have now molecularly characterized and phylogenetically analysed sequences, including three (3) complete genomes, of the severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) isolated from nine (9) patients observed in Ghana. Eight (8) of these patients reported a recent history of foreign travel and one (1) reported none. We performed high-throughput sequencing on the 9 samples after confirming a high concentration of viral RNA. In addition, we estimated the potential impact that long-distance transportation of samples to testing centres may have on sequencing outcomes. For this, we assessed two samples that were closest in terms of viral RNA concentration but were transported from sites over 400 km apart. All sequences were compared to previous sequences from Ghana and representative sequences from regions where our patients had previously travelled. Complete genomes were obtained for three (3) samples, along with one near-complete genome with a coverage of 95.6%. Sequences with coverage in excess of 80% were found to belong to three lineages, namely A, B.1 and B.2. Our sequences clustered in two different clades, with the majority falling within a clade composed of sequences from sub-Saharan Africa. Less RNA fragmentation was seen in sample KATH23, which was collected 9 km from the testing site, than in sample TTH6, which was collected and transported over a distance of 400 km to the testing site. The clustering of several sequences from sub-Saharan Africa suggests regional circulation of the viruses in the subregion. Importantly, there may be a need to decentralize testing sites and build more capacity across Africa to boost the sequencing output of the subregion.
ARTICLE | doi:10.20944/preprints202111.0426.v1
Subject: Biology And Life Sciences, Immunology And Microbiology Keywords: Bacillus anthracis; anthrax; outbreak; phylogenetics; detection assay
Online: 23 November 2021 (14:44:33 CET)
The zoonotic disease anthrax caused by the endospore-forming bacterium Bacillus anthracis is very rare in Germany. In the state of Bavaria, the last case occurred in July of 2009 resulting in four dead cows. In August of 2021, the disease reemerged after heavy rains, killing one gestating cow. Notably, both outbreaks affected the same pasture, suggesting a close epidemiological connection. B. anthracis could be grown from blood culture and the presence of both virulence plasmids (pXO1 and pXO2) was confirmed by PCR. Also, recently developed diagnostic tools enabled rapid detection of B. anthracis cells and nucleic acids directly in clinical samples. The complete genome of the strain isolated from blood, designated BF-5, was DNA-sequenced and phylogenetically grouped within the B.Br.CNEVA clade that is typical for European B. anthracis strains. The genome was almost identical to BF-1, the isolate of 2009, separated only by three single nucleotide polymorphisms on the chromosome, one on plasmid pXO2 and three indel-regions. Further, B. anthracis DNA was detected by PCR from soil samples taken from spots where the cow had fallen onto the pasture. New tools based on phage receptor binding proteins enabled the microscopic detection and isolation of B. anthracis directly from soil samples. These environmental isolates were genotyped and found to be SNP-identical to BF-1. Therefore, it seems that the BF-5 genotype is currently the prevalent one at the affected premises. The contaminated area was subsequently disinfected with formaldehyde.
ARTICLE | doi:10.20944/preprints202010.0470.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: Gastroenteritis; rotavirus; G3[P8]; phylogenetics; equine-like
Online: 22 October 2020 (22:51:07 CEST)
Globally, rotavirus group A (RVA) remains a major cause of severe childhood diarrhea, despite the use of vaccines in >100 countries. RVA sequencing for local outbreaks facilitates investigation into strain composition, origins, spread, and vaccine failure. In 2018, we collected 248 stool samples from children aged <13 years admitted with diarrheal illness to Kilifi County Hospital, coastal Kenya. Antigen screening detected RVA in 55 samples (22.2%). Of these, VP7 (G) and VP4 (P) segments were successfully sequenced in 48 (87.3%), and phylogenetic analysis based on the VP7 sequences identified seven genetic clusters with six different GP combinations: G3P, G1P, G2P, G2P, G9P and G12P. The G3P strains predominated during the season (n=37, 67.2%) and comprised three G3 genetic clusters that fell within Lineages I and IX (the latter also known as the equine-like G3 lineage). Both G3 lineages have recently been detected in several countries. Our study is the first to document infection of African children with G3 lineage IX. These data highlight the global nature of RVA transmission and the importance of increasing global rotavirus vaccine coverage.
ARTICLE | doi:10.20944/preprints202009.0487.v1
Subject: Biology And Life Sciences, Virology Keywords: SARS-CoV-2; COVID-19; Phylogenetics; mortality
Online: 21 September 2020 (03:35:15 CEST)
The age-related mortality and morbidity risk of COVID-19 has been considered speculative without enough scientific evidence. This study aimed to collect more evidence on the association between patient age and the risk of severe disease and/or mortality from SARS-CoV-2 infection. A genomic dataset with metadata (3608 samples), retrieved from GISAID and covering different geographical regions, was grouped into 10 age groups (0-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, 81-90, 91-100 years) and classified as high-risk or low-risk according to patient clinical status. Genomic sequences were aligned with MAFFT and a phylogenetic tree was built with FASTTREE in order to identify age-risk associations based on phylogenetic clustering. Case fatality rates (CFR), as well as the odds ratio (OR) for high-risk outcomes, were calculated for different age groups. Results revealed that individuals aged between 25-50 years have the best immune response to the infection. On the other hand, disease fatality was higher in patients aged above 50 years. We created an application to calculate the OR of being at high risk given a certain age threshold from GISAID datasets. OR values increased between ages 1-10 years (1.271) and 11-20 years (1.313) but reduced at age range 21-30 years (1.290) and increased again for 61-70 years (2.465). CFR calculated for each of the age groups had peak values at 90-100 years (26.8%) and the lowest at 0-10 years (0%). The CFR for ages above 50 years (11.6%-26.8%) was about twice as high as that for ages below (0-6.6%). The phylogenetic analysis revealed that the majority of samples obtained from India were low-risk across different age groups and were assigned to clade GH. Another cluster, from Singapore, showed unfavorable patient outcomes across several age groups and was classified under clade O. To conclude, the analyses in this study showed a variety of age-risk associations. As scientists from different countries upload more genomes to globally shared databases, more evidence will reinforce mortality risk associations in COVID-19 patients.
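The two summary statistics named in this abstract follow standard definitions, sketched below. CFR is deaths divided by cases, expressed as a percentage, and the OR compares the odds of a high-risk outcome above an age threshold with the odds below it. The function names and the counts are illustrative assumptions and do not reproduce the authors' application or their data.

```python
def case_fatality_rate(deaths, cases):
    """Case fatality rate as a percentage: 100 * deaths / cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    return 100.0 * deaths / cases


def odds_ratio(above_high, above_low, below_high, below_low):
    """Odds ratio of a high-risk outcome above vs. below an age threshold.

    OR = (above_high / above_low) / (below_high / below_low),
    computed as a cross-product so only one division is needed.
    """
    denominator = above_low * below_high
    if denominator == 0:
        raise ValueError("odds ratio is undefined when a cell is zero")
    return (above_high * below_low) / denominator


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example_cfr = case_fatality_rate(deaths=5, cases=50)
example_or = odds_ratio(above_high=20, above_low=80, below_high=10, below_low=90)
```

With these made-up counts the odds above the threshold are 20/80 and below it 10/90, so the ratio works out to 2.25.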
ARTICLE | doi:10.20944/preprints202009.0005.v1
Subject: Biology And Life Sciences, Virology Keywords: mink astrovirus; molecular diagnostics; molecular polymorphism; phylogenetics
Online: 1 September 2020 (11:18:15 CEST)
Mink astrovirus infection remains a poorly understood disease entity, and the aetiological agent itself causes disease with a heterogeneous course, including gastrointestinal and neurological symptoms. This paper presents cases of astrovirus infection in mink from continental Europe. RNA was isolated from the brains and intestines of animals showing symptoms typical of shaking mink syndrome (n = 6). RT-PCR was used to detect astrovirus genetic material, and the reaction products were separated on a 1% agarose gel. The specificity of the reaction was confirmed by sequencing all samples. The presence of astrovirus RNA was detected in each of the samples tested. Sequencing and bioinformatic analysis indicated the presence of the same variant of the virus in all samples. Comparison of the variant with the sequences available in bioinformatics databases confirmed that the Polish isolates form a separate clade, closely related to Danish isolates. The similarity of the Polish variant to those isolated in other countries ranged from 2.4% (in relation to Danish isolates) to 7.1% (in relation to Canadian isolates). Phylogenetic relationships between variants appear to be associated with the geographic distances between them. To our knowledge, this work describes the first results on the molecular epidemiology of MAstV in continental Europe. The detection of MAstV in Central Europe indicates the need for further research to broaden our understanding of the molecular epidemiology of MAstV in Europe.
ARTICLE | doi:10.20944/preprints202310.1211.v1
Subject: Computer Science And Mathematics, Software Keywords: software integration; middleware; data centric workflows; computational phylogenetics
Online: 19 October 2023 (04:47:55 CEST)
Epidemiological surveillance and phylogenetic studies now rely on processing and analysing huge volumes of data. Processing consists of running and refining a series of intertwined computational tasks. Although several web applications exist for data processing and interactive visualization in phylogenetic studies, integrating many different tools and algorithms, they execute totally or partially on the client side, which makes them unsuitable for dealing with huge volumes of data. Studies are often also not easy to reproduce. On the other hand, data-centric workflow systems proposed in recent years deal better with increasingly large datasets. Integrating these systems within phylogenetic tools will allow the tools to scale as required and will also promote the reproducibility of studies. We therefore propose the FLOWViZ middleware to facilitate the integration of a state-of-the-art data-centric workflow system, Apache Airflow, within web applications for phylogenetic analyses. This framework abstracts contracts and a core API for defining tools and workflows, where tools are assumed to be containerized. FLOWViZ has been tested and evaluated within the PHYLOViZ web application, a tool supporting phylogenetic inference and data visualization.
COMMUNICATION | doi:10.20944/preprints202304.0343.v1
Subject: Biology And Life Sciences, Virology Keywords: Zika; arboviruses; vector-borne infections; genomic surveillance; phylogenetics
Online: 14 April 2023 (03:51:13 CEST)
The Americas, particularly Brazil, were greatly impacted by the widespread outbreak of Zika virus (ZIKV) in 2015 and 2016. Efforts were made to implement genomic surveillance of ZIKV as part of the public health responses. The accuracy of spatiotemporal reconstructions of the epidemic spread relies on the unbiased sampling of the transmission process. In the early stages of the outbreak, we recruited patients exhibiting clinical symptoms of arbovirus-like infection from Salvador and Campo Formoso, Bahia, in Northeast Brazil. Between May 2015 and June 2016, we identified 21 cases of acute ZIKV infection and subsequently recovered 14 near full-length sequences using the amplicon tiling multiplex approach with nanopore sequencing. We performed a time-calibrated discrete phylogeographic analysis to trace the spread and migration history of ZIKV. Our phylogenetic analysis supports a consistent relationship between ZIKV migration from Northeast to Southeast Brazil and its subsequent dissemination beyond Brazil. Additionally, our analysis provides insights into the migration of ZIKV from Brazil to Haiti and the role Brazil played in the spread of ZIKV to other countries, such as Singapore, the USA and the Dominican Republic. The data generated by this study enhance our understanding of ZIKV dynamics and support the existing knowledge, which can aid future surveillance efforts against the virus.
ARTICLE | doi:10.20944/preprints202210.0187.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: Sorex araneus complex; karyotype; introgression; phylogenetics; hybridization; Iberia
Online: 13 October 2022 (07:10:47 CEST)
Mitochondrial introgression raises questions of biogeography, and extent of reproductive isolation and natural selection. Previous phylogenetic work on the Sorex araneus complex revealed apparent mitonuclear discordance in Iberian shrews indicating past hybridization of S. granarius and the Carlit chromosomal race of S. araneus, enabling introgression of the S. araneus mitochondrial genome into S. granarius. To further study this, we genetically typed 61 S. araneus/coronatus/granarius from localities in Portugal, Spain, France and Andorra at mitochondrial, autosomal and sex-linked loci, and combined our data with the previously published sequences. Our data are consistent with the earlier data that S. coronatus and S. granarius are the most closely related of the three species and confirm that S. granarius from the Central System mountain range in Spain captured the mitochondrial genome from a population of S. araneus. The mitochondrial capture event can be explained by invoking a biogeographical scenario whereby S. araneus was in contact with S. granarius during the Younger Dryas in central Iberia despite the two species currently having disjunct distributions. We discuss whether selection favoured S. granarius with an introgressed mitochondrial genome. Our data also suggest recent hybridization and introgression between S. coronatus and S. granarius and between S. araneus and S. coronatus.
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: machine learning; deep learning; bioinformatics; phylogenetics; cancer evolution
Online: 17 February 2021 (09:40:45 CET)
The exponential growth of biomedical data in recent years urged the application of numerous machine learning techniques to address emerging problems in biology and clinical research. By enabling automatic feature extraction, selection and generation of predictive models, these methods can be used to efficiently study complex biological systems. Machine learning techniques are frequently integrated with bioinformatic methods, as well as curated databases and biological networks, to enhance training and validation, identify the best interpretable features, and enable feature and model investigation. Here, we review recently developed methods that incorporate machine learning within the same framework with techniques from molecular evolution, protein structure analysis, systems biology and disease genomics. We outline the challenges posed for machine learning, and in particular, deep learning in biomedicine and suggest unique opportunities for machine learning techniques integrated with established bioinformatics approaches to overcome some of these challenges.
ARTICLE | doi:10.20944/preprints201910.0086.v1
Subject: Biology And Life Sciences, Life Sciences Keywords: phylogenomics; phylogenetics; codon usage bias; Tree of Life
Online: 8 October 2019 (10:43:24 CEST)
Phylogenies depict shared evolutionary patterns and structures on a tree topology, enabling the identification of hierarchical and historical relationships. Recent analyses indicate that phylogenetic signals extend beyond the primary structure of protein or DNA, and various aspects of codon usage biases are phylogenetically conserved. Several functional biases exist within genes, including the number of codons that are used, the position of the codons, and the overall nucleotide composition of the genome. Codon usage biases can significantly affect transcription and translational efficiencies, leading to differential gene expression. Although systematic codon usage biases originate from the overall GC content of a species, ramp sequences, codon aversion, codon pairing, and tRNA competition also significantly affect gene expression and are phylogenetically conserved. We review recent advances in analyzing codon usage biases and their implications in phylogenomics. We first outline common phylogenomic techniques. Next, we identify several codon usage biases and their effects on secondary structure, gene expression, and implications in phylogenetics. Finally, we suggest how codon usage biases can be included in phylogenomics. By incorporating various codon usage biases in common phylogenomic algorithms, we propose that we can significantly improve tree inference. Since codon usage biases have significant biological implications, they should be considered in conjunction with other phylogenetic algorithms.
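Two of the quantities discussed in this review, codon counts and GC content, are simple to compute from a coding sequence. The sketch below is a minimal illustration with hypothetical function names and a toy sequence. It counts codons in reading frame 0, reports GC content, and gives the relative usage of the four alanine codons (GCT, GCC, GCA, GCG).

```python
from collections import Counter

ALANINE_CODONS = ("GCT", "GCC", "GCA", "GCG")


def codon_counts(cds):
    """Count codons in frame 0, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_usage(counts, synonymous_codons):
    """Share of each codon among a set of synonymous codons."""
    total = sum(counts[codon] for codon in synonymous_codons)
    if total == 0:
        return {codon: 0.0 for codon in synonymous_codons}
    return {codon: counts[codon] / total for codon in synonymous_codons}


# Toy coding sequence: ATG GCT GCC GCT TAA (Met, Ala, Ala, Ala, stop).
toy_cds = "ATGGCTGCCGCTTAA"
toy_counts = codon_counts(toy_cds)
toy_alanine_usage = relative_usage(toy_counts, ALANINE_CODONS)
```

A full analysis would repeat this over every amino acid using a complete genetic code table and over many genes per species. The per-amino-acid shares are what allow codon preferences to be compared between taxa.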
ARTICLE | doi:10.20944/preprints201808.0318.v1
Subject: Biology And Life Sciences, Virology Keywords: respiratory syncytial virus; phylogenetics; evolution; multi-year persistence
Online: 18 August 2018 (05:14:42 CEST)
There is an ongoing global pandemic of human respiratory syncytial virus (RSV) infection that results in substantial annual morbidity and mortality. In Australia, RSV is the major cause of acute lower respiratory tract infections (ALRI). Nevertheless, little is known about the extent and origins of genetic diversity of RSV in Australia, nor the factors that shape this diversity. We conducted a genome-scale analysis of RSV infections in New South Wales (NSW). RSV genomes were successfully sequenced for 144 specimens collected between 2010-2016. Of these, 64 belonged to the RSVA and 80 to the RSVB subtype. Phylogenetic analysis revealed a wide diversity of RSV lineages within NSW and that both subtypes evolved rapidly in a strongly clock-like manner, with mean rates of approximately 6-8 x 10^-4 nucleotide substitutions per site per year. There was only weak evidence for geographic clustering of sequences, indicative of fluid patterns of transmission within the infected population, and no evidence of any clustering by patient age such that viruses in the same lineages circulate through the entire host population. Importantly, we show that both subtypes circulated concurrently in NSW with multiple introductions into the Australian population in each year, and only limited evidence for multi-year persistence.
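The abstract does not state how the substitution rate was estimated. One common first check of clock-like behaviour is a root-to-tip regression, in which the slope of genetic distance from the root against sampling date approximates the rate in substitutions per site per year. The sketch below assumes that approach and uses invented data points; it is not the authors' method or their data.

```python
def clock_rate(sampling_dates, root_to_tip):
    """Least-squares slope of root-to-tip distance against sampling date.

    Dates are in decimal years and distances in substitutions per site,
    so the slope has units of substitutions per site per year.
    """
    n = len(sampling_dates)
    if n != len(root_to_tip) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_date = sum(sampling_dates) / n
    mean_dist = sum(root_to_tip) / n
    sxx = sum((t - mean_date) ** 2 for t in sampling_dates)
    if sxx == 0:
        raise ValueError("sampling dates must not all be equal")
    sxy = sum(
        (t - mean_date) * (d - mean_dist)
        for t, d in zip(sampling_dates, root_to_tip)
    )
    return sxy / sxx


# Invented, perfectly clock-like points with a slope of 7e-4 per year.
toy_dates = [2010.0, 2012.0, 2014.0, 2016.0]
toy_distances = [0.0010, 0.0024, 0.0038, 0.0052]
toy_rate = clock_rate(toy_dates, toy_distances)
```

Real data scatter around the line, and formal rate estimates usually come from a dated-tip phylogenetic model, not from this regression alone.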
TECHNICAL NOTE | doi:10.20944/preprints201804.0047.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: BLAST; DNA, open source; phylogenetics; R; sequence orthology.
Online: 4 April 2018 (06:00:40 CEST)
The exceptional increase in molecular DNA sequence data in open repositories is mirrored by an ever-growing interest among evolutionary biologists to harvest and use those data for phylogenetic inference. Many quality issues, however, are known and the sheer amount and complexity of data available can pose considerable barriers to their usefulness. A key issue in this domain is the high frequency of sequence mislabelling encountered when searching for suitable sequences for phylogenetic analysis. These issues include the incorrect identification of sequenced species, non-standardised and ambiguous sequence annotation, and the inadvertent addition of paralogous sequences by users, among others. Taken together, these issues likely add considerable noise, error or bias to phylogenetic inference, a risk that is likely to increase with the size of phylogenies or the molecular datasets used to generate them. Here we present a software package, phylotaR, that bypasses the above issues by using instead an alignment search tool to identify orthologous sequences. Our package builds on the framework of its predecessor, PhyLoTa, by providing a modular pipeline for identifying overlapping sequence clusters using up-to-date GenBank data and providing new features, improvements and tools. We demonstrate our pipeline’s effectiveness by presenting trees generated from phylotaR clusters for two large taxonomic clades: palms and primates. Given the versatility of this package, we hope that it will become a standard tool for any research aiming to use GenBank data for phylogenetic analysis.
Subject: Biology And Life Sciences, Virology Keywords: Phylogenetics; Subclinical infection; FMD outbreaks; Disease control; Surveillance; Sentinels
Online: 30 August 2021 (11:54:39 CEST)
The genetic diversity of foot-and-mouth disease virus (FMDV) poses a challenge to the successful control of the disease, and it is important to identify the emergence of different strains in endemic settings. The objective of this study was to evaluate sampling of clinically healthy livestock at slaughterhouses as a strategy for genomic FMDV surveillance. Serum samples (n = 11875) and oropharyngeal fluid (OPF) samples (n = 5045) were collected from asymptomatic cattle and buffalo on farms in eight provinces in southern and northern Vietnam (2015 to 2019) to characterize viral diversity. Outbreak sequences were collected between 2009 and 2019. In two slaughterhouses in southern Vietnam, 1200 serum and OPF samples were collected from asymptomatic cattle and buffalo (2017 to 2019) as a pilot study on the use of slaughterhouses as sentinel points of surveillance. VP1 sequences were analyzed using discriminant principal component analysis and time-scaled phylodynamic trees. Six of seven serotype O and A clusters circulating in southern Vietnam from 2017 to 2019 were detected at least once in slaughterhouses, sometimes pre-dating outbreak sequences associated with the same cluster by 4-6 months. Routine sampling at slaughterhouses may provide a timely and cost-effective strategy for genomic surveillance to identify circulating and emerging FMDV strains.
CONCEPT PAPER | doi:10.20944/preprints202106.0578.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: homology; developmental mechanism; evidential integration; eumetazoan body plan; phylogenetics
Online: 23 June 2021 (11:45:06 CEST)
Reconstructing ancestral species is a challenging endeavour: fossils are often scarce or enigmatic, and inferring ancestral characters based on novel molecular approaches (e.g. comparative genomics or developmental genetics) has long been controversial. A key philosophical challenge pertinent at present is the lack of a theoretical framework capable of evaluating inferences of homology made through integration of multiple kinds of evidence (e.g. molecular, developmental, or morphological). Here, I present just such a framework. I start with a brief history and critical assessment of attempts at inferring morphological homology through developmental genetics. I then bring attention to a recent model of homology, namely Character Identity Mechanisms (DiFrisco, Love, & Wagner, 2020), intended partly to elucidate the relationships between morphological characters, developmental genetics, and homology. I utilise and build on this model to construct the evaluative framework mentioned above, which judges the epistemic value of evidence of each kind in each particular case based on three proposed criteria: effectiveness, admissibility, and informativity, as well as providing a generalised guideline on how it can be scientifically operationalised. I then point out the evolution of the eumetazoan body plan as a case in point where the application of this framework can yield satisfactory results, both empirically and conceptually. I will conclude with a discussion on some potential implications for more general philosophy of biology and philosophy of science, especially surrounding evidential integration, models and explanation, and reductionism.
ARTICLE | doi:10.20944/preprints202310.1689.v1
Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: marine yeast; microbial phylogenetics; bioactivity; antioxidant; biological control; tyrosinase inhibition
Online: 26 October 2023 (11:10:54 CEST)
Marine yeasts have versatile applications in industrial, medical and environmental fields, but have received little attention compared to terrestrial yeasts and filamentous fungi. In this study, a phylogenetic analysis of 11 marine-derived yeasts was conducted using internal transcribed spacers and nuclear large subunit rDNA, and their bioactivities, such as antioxidant, antibacterial, and tyrosinase inhibition activities, were investigated. The 11 marine-derived yeasts were identified to belong to seven species including Geotrichum candidum, Metschnikowia bicuspidata, Papiliotrema fonsecae, Rhodotorula mucilaginosa, Vishniacozyma carnescens, Yamadazyma olivae, and Yarrowia lipolytica, and three strains of these were candidates for new species of the genera Aureobasidium, Rhodotorula, and Vishniacozyma. Most extracts showed antioxidant activity, whereas seven strains exhibited antibacterial activities against Bacillus subtilis. Only Aureobasidium sp. US-Sd3 among the 11 isolates showed tyrosinase inhibition. Metschnikowia bicuspidata BP-Up1 and Yamadazyma olivae K2-6 showed notable radical-scavenging activity, which has not been reported previously. Among the isolates, Aureobasidium sp. US-Sd3 exhibited the highest antibacterial and tyrosinase inhibitory activities. Overall, our results demonstrate the potential of marine-derived yeasts as a source of bioactive compounds for improving industrial applications.
ARTICLE | doi:10.20944/preprints202303.0496.v1
Subject: Biology And Life Sciences, Biology And Biotechnology Keywords: Grey slender loris; Mysore slender loris; Malabar slender loris; Phylogenetics
Online: 29 March 2023 (02:37:35 CEST)
Phylogenetics is a powerful tool for understanding the evolutionary history of organisms and for informing conservation and management of species. Among the strepsirrhine primates, the slender lorises are a threatened genus of small, nocturnal animals confined to India and Sri Lanka. The grey slender loris (Loris lydekkerianus) is divided into several subspecies based on morphological and geographical variation, a division not yet backed by molecular data. We investigated the genetic basis of taxonomic and biogeographic variation as well as the phylogenetic divergence of two subspecies of the grey slender loris in southern India: the Mysore slender loris (Loris lydekkerianus ssp. lydekkerianus) and the Malabar slender loris (Loris lydekkerianus ssp. malabaricus). We sequenced and assembled the whole mitochondrial genomes of three representative individuals from their distribution in southern India and compared them with publicly available mitogenomes of other lorises. We found that the two Indian subspecies vary by 2.09% in the COX1 and CYTB gene regions and form distinct monophyletic clades that diverged about 1.049 million years ago. Our results support the morphological classification of these two subspecies in southern India and have implications for their conservation and management in captivity and in the wild.
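A percentage difference between two aligned gene regions, such as the 2.09% quoted above, is often reported as an uncorrected pairwise distance (p-distance), which is the fraction of compared sites at which two sequences differ. The abstract does not state which measure was used, so the sketch below is only an assumption about one plausible calculation, run on toy sequences.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected pairwise distance between two aligned sequences.

    Only columns where both sequences have an unambiguous base
    (A, C, G or T) are compared; gaps and ambiguity codes are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "ACGT" and base_b in "ACGT":
            compared += 1
            if base_a != base_b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy aligned fragments differing at one of ten sites.
toy_distance = p_distance("ACGTACGTAC", "ACGTACGTAA")
```

Multiplying the result by 100 gives the percentage form used in the abstract. Model-corrected distances would be somewhat larger for more divergent sequences.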
ARTICLE | doi:10.20944/preprints202204.0050.v1
Subject: Biology And Life Sciences, Virology Keywords: porcine circovirus; PCV2; domestic pig; wild boar; subtype; phylogenetics; MinION; Ukraine
Online: 7 April 2022 (03:03:31 CEST)
Porcine circovirus type 2 (PCV2) is responsible for a number of porcine circovirus associated diseases (PCAD) that can severely impact domestic pig herds. For a non-enveloped virus with a small genome (1.7 kb ssDNA), PCV2 is remarkably diverse, with 8 subtypes (a-h). New subtypes of PCV2 can spread through migration of wild boars, which are thought to infect domestic pigs and spread further through the domestic pig trade. Despite a large swine population, the diversity of PCV2 subtypes in Ukraine has been undersampled, with few PCV2 genome sequences reported in the past decade. To gain a deeper understanding of PCV2 subtype diversity in Ukraine, samples of blood serum were collected from wild boars (n = 107) that were hunted in Ukraine during the November-December 2012 hunting season. We found 34/107 (31.8%) prevalence of PCV2 by diagnostic PCR. For domestic pigs, liver samples (n = 16) were collected from a commercial market near Kharkiv in 2019, of which 6/16 (37%) were positive for PCV2. We sequenced the genotyping locus ORF2, a gene encoding the PCV2 viral capsid (Cp), for 11 wild boar and 6 domestic pig samples in Ukraine using an Oxford Nanopore MinION device. Of 17 samples with resolved subtypes, PCV2 subtype b was most common in wild boar (10/11, 91%), while domestic pigs were infected with subtypes b and d. We also detected subtype b/d and b/a co-infections in wild boar and domestic pigs, respectively, and subtype f in a wild boar from Poltava for the first time in Ukraine. Building a maximum likelihood phylogeny, we identified a sublineage of PCV2 subtype b infections in both wild and domestic swine, suggesting a possible epizootic cluster and ecological interaction in northeastern Ukraine.
ARTICLE | doi:10.20944/preprints202107.0572.v1
Subject: Biology And Life Sciences, Anatomy And Physiology Keywords: CFPHV; ChHV5; phylogenetics; phylogenomics; viral evolution and diversity; marine turtles; fibropapillomatosis
Online: 26 July 2021 (11:58:37 CEST)
The spreading global sea turtle fibropapillomatosis (FP) epizootic is threatening some of Earth’s ancient reptiles, adding to the plethora of threats faced by these keystone species. Understanding this neoplastic disease, and its likely aetiological pathogen, chelonid alphaherpesvirus 5 (ChHV5), is crucial to understand how the disease impacts sea turtle populations and species and the future trajectory of disease incidence. We generated 20 ChHV5 genomes, from three sea turtle species, to better understand the viral variant diversity and gene evolution of this oncogenic virus. We revealed previously underappreciated genetic diversity within this virus (with an average of 2,035 single nucleotide polymorphisms [SNPs], 1.54% of the ChHV5 genome) and identified genes under the strongest evolutionary pressure. Furthermore, we investigated the phylogeny of ChHV5 at both genome and gene level, confirming the propensity of the virus to be interspecific with related variants able to infect multiple sea turtle species. Finally, we revealed unexpected intra-host diversity, with up to 0.15% of the viral genome varying between ChHV5 genomes isolated from different tumours concurrently arising within the same individual. These findings offer important insights into ChHV5 biology and provide genomic resources for this oncogenic virus.
ARTICLE | doi:10.20944/preprints201812.0026.v2
Subject: Biology And Life Sciences, Virology Keywords: Lactobacillus plantarum; phage; new genus; annotation; comparative genomics; phylogenetics; isolation; diversity
Online: 11 June 2019 (09:54:23 CEST)
Lactobacillus plantarum is a bacterium with probiotic properties and promising applications in the food industry and agriculture. So far, bacteriophages of this bacterium have received only moderate attention. We examined the diversity of five new L. plantarum phages via whole genome shotgun sequencing and in silico protein predictions. Moreover, we examined their phylogeny and their potential genomic similarities to other complete phage genome records through extensive nucleotide and protein comparisons. These analyses revealed a high degree of similarity among the five phages, which extended to the vast majority of predicted virion-associated proteins. Based on these results, we selected one of the phages as a representative and performed transmission electron microscopy and structural protein sequencing. Overall, the results suggested that the five phages belong to the family Myoviridae and that they have a long genome of 137,973-141,344 bp, a G/C content of 36.3-36.6% that is quite distinct from their host’s, and, surprisingly, 7 to 15 tRNAs. On average, only 41/174 of their predicted genes were assigned a function. The comparative analyses revealed considerable genetic diversity for the five L. plantarum phages of this study. Hence, the new genus “Semelevirus” was proposed, comprising exclusively these five phages. This novel lineage of Lactobacillus phages provides further insight into the genetic heterogeneity of phages infecting Lactobacillus sp. The five new Lactobacillus phages have potential value for the development of more robust starters through, for example, the selection of mutants insensitive to phage infections. The five phages could also form part of phage cocktails, which producers would apply at different stages of L. plantarum fermentations in order to create a range of organoleptic outputs.
BRIEF REPORT | doi:10.20944/preprints202305.1931.v1
Subject: Biology And Life Sciences, Biochemistry And Molecular Biology Keywords: 16S rRNA gene; phylogenetics; amplicon metagenomics; bacterial species; gene structure and sequence
Online: 26 May 2023 (11:38:01 CEST)
Bacterial phylogenetics has largely been determined via 16S rRNA gene sequencing and phylogenetic tree reconstruction. The observed utility of this approach has driven the popularity of the 16S rRNA gene amplicon metagenomics method for profiling and identifying diverse microbes from specific habitats. This work sought to develop universal primers for amplifying the 16S rRNA gene from a consortium of disparate microbial species. Using multiple sequence alignment of the 16S rRNA gene of a variety of microbes, the resulting highly conserved region of the consensus sequence was used to design universal polymerase chain reaction (PCR) primers for the 16S rRNA gene. Application of the universal primers in simulated PCR reveals poor amplification efficiency, with only 12 of 31 species generating an amplicon. BLAST analysis of the resulting amplicons reveals a classification error of 50%. More significantly, analysis of the amplicon length indicates variable read length ranging from 81 to 122 base pairs, compared to the predicted read length of 100 base pairs. This suggests that the 16S rRNA gene harbours significant, hitherto underappreciated sequence diversity, and may have unknown alternative splicing and recombination mechanisms. Overall, results from this study suggest that primer design for 16S rRNA amplicon metagenomics may be application and habitat specific, and that it is difficult to design universal primers for all bacterial species. Conceptually, this means that there may be sequence co-evolution in the 16S rRNA gene for microbial species in a habitat where environmental and nutritional conditions impact 16S rRNA gene structure and sequence. In essence, the 16S rRNA gene may harbour epigenetic signals at the gene level.
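The idea of a simulated PCR can be illustrated with a minimal sketch. It is not the study's method: it assumes exact primer matching on a single strand, and the function names and toy sequences are hypothetical. A template yields an amplicon only if the forward primer and the reverse complement of the reverse primer both occur, in that order.

```python
from typing import Optional

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def simulate_pcr(template: str, fwd: str, rev: str) -> Optional[str]:
    """Return the predicted amplicon, or None if either primer fails to bind.

    Assumes exact matches only; real primers tolerate some mismatches.
    """
    start = template.find(fwd)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the template
    # strand we look for its reverse complement downstream of the forward site.
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Toy sequences, invented for illustration.
fwd_primer = "ACGTAC"
rev_primer = "TCATGG"
templates = {
    "species_A": "TTTACGTACGGGGGCCATGAAAA",  # both primer sites present
    "species_B": "TTTACGAACGGGGGCCATGAAAA",  # forward site mutated
}

for name, seq in templates.items():
    amplicon = simulate_pcr(seq, fwd_primer, rev_primer)
    if amplicon is None:
        print(f"{name}: no amplicon")
    else:
        print(f"{name}: amplicon of {len(amplicon)} bp")
```

Running every species through such a check and tabulating the amplicon lengths is one way a figure like "12 of 31 species" or a length range could be produced, although the abstract does not say which tool was used.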
CONCEPT PAPER | doi:10.20944/preprints202105.0767.v1
Subject: Biology And Life Sciences, Animal Science, Veterinary Science And Zoology Keywords: evidential integration; causal explanation; early animal evolution; phylogenetics; macroevolution; evolutionary scenario; cross-disciplinary research
Online: 31 May 2021 (12:25:44 CEST)
Molecular methods have revolutionised virtually every area of biology, and metazoan phylogenetics is no exception: molecular phylogenies, molecular clocks, comparative phylogenomics, and developmental genetics have collectively transformed our understanding of the evolutionary history of animals. Moreover, the diversity of methods and models within molecular phylogenetics has resulted in significant disagreement among molecular phylogenies as well as between these and traditional phylogenies. Here, I argue that tackling this multifaceted problem lies in integrating evidence to infer the best evolutionary scenario. I begin with an overview of recent developments in early metazoan phylogenetics, followed by a discussion of key conceptual issues in phylogenetics revolving around phylogenetic evidence and theory. I then argue that integration of different kinds of evidence is necessary for arriving at the best evolutionary scenario rather than the best-fitting cladogram. Finally, I discuss the prospects of this view in stimulating interdisciplinary cross-talk in early metazoan research and beyond.
ARTICLE | doi:10.20944/preprints202003.0211.v1
Subject: Public Health And Healthcare, Nursing Keywords: research program; taxonomic theory; phenetics; rational systematics; numerical systematics; typology; biosystematics; biomorphics; phylogenetics; evo-devo
Online: 12 March 2020 (10:18:15 CET)
Biological diversity (BD), as explored by biological systematics, is a complexly organized natural phenomenon that can be partitioned into several aspects defined with reference to the various causal factors structuring biota. These BD aspects are studied by particular research programs based on specific taxonomic theories (TT). Together they provide a framework for comprehending the structure of biological systematics and its multi-aspect relations to other fields of biology. General principles of individualizing BD aspects and construing TT as quasi-axiomatics are briefly considered. It is stressed that each TT is characterized by a specific combination of interrelated ontological and epistemological premises most adequate to the BD aspect that the TT deals with. The following contemporary research programs in systematics are recognized and briefly characterized: phenetic, rational (with several subprograms), numerical, typological (with several subprograms), biosystematic, biomorphic, phylogenetic (with several subprograms), and evo-devo. From the perspective of scientific pluralism, all these research programs, each related to a particular naturally defined BD aspect, are of the same biological and scientific significance, and none of them can claim a privileged position. They elaborate “locally” natural classifications that can be united by a kind of generalized faceted classification.
REVIEW | doi:10.20944/preprints202309.0905.v1
Subject: Biology And Life Sciences, Ecology, Evolution, Behavior And Systematics Keywords: phylogenetics; hybridization; introgression; horizontal gene transfer; lateral gene transfer; phylogenetic incongruence; gene-tree-species-tree discordance
Online: 14 September 2023 (04:50:27 CEST)
Phylogenomics has enriched our understanding of the Tree of Life. Non-vertical modes of evolution—such as hybridization/introgression and horizontal gene transfer—deviate from a strictly bifurcating tree model, mirroring a network-like or reticulate structure. Here, we present an overview of a phylogenomic workflow for inferring organismal histories, calibrating those histories to evolutionary time, and detecting reticulate evolution. Mitigating analytical sources of error facilitates accurate reconstructions of evolutionary history and, in turn, characterization of non-vertical modes of evolution. Workflows and methods discussed herein may aid in the rigorous inference of organismal histories in geologic time and reticulation, providing a clearer understanding of the evolutionary process.
ARTICLE | doi:10.20944/preprints202002.0249.v1
Subject: Biology And Life Sciences, Agricultural Science And Agronomy Keywords: Fungal diversity; Saccharomyces; genetic diversity; glyphosate-based herbicides; copper-based fungicides; RoundUp Ready™ corn; phylogenetics
Online: 17 February 2020 (15:37:11 CET)
Saccharomyces cerevisiae is a phenotypically diverse species that adapts to a wide variety of environments by exploiting standing genetic diversity and selecting for advantageous mutations. Glyphosate-based herbicides and copper-based fungicides affect non-target organisms, and these incidental exposures can impact microbial populations. In this study, glyphosate resistance was found in a historical collection of yeast assembled over the last century, but only in yeast isolated after the introduction of glyphosate. The most glyphosate-resistant yeasts were isolated from agricultural sites. However, herbicide application at these sites was not recorded. In an effort to assess glyphosate resistance and its impact on non-target microorganisms, yeast were harvested from 15 areas with known herbicidal histories, including an organic farm, conventional farm, remediated coal mine, suburban locations, state park, and a national forest. Yeast representing 23 genera were isolated from 237 samples of plant, soil, spontaneous fermentation, nut, flower, fruit, feces, and tree material. Saccharomyces, Candida, Metschnikowia, Kluyveromyces, Hanseniaspora, and Pichia were genera commonly found across our sampled environments. Managed areas had less species diversity, and at the brewery only Saccharomyces and Pichia were isolated. A conventional farm growing RoundUp Ready™ corn had the lowest phylogenetic diversity and the highest glyphosate resistance. The mine was sprayed with multiple herbicides, including a commercial formulation of glyphosate; however, the yeast did not have elevated glyphosate resistance. In contrast to the conventional farm, the mine was exposed to glyphosate only one year prior to sample isolation. Glyphosate resistance is an example of the anthropogenic selection of non-target organisms.
ARTICLE | doi:10.20944/preprints201806.0050.v1
Subject: Biology And Life Sciences, Virology Keywords: cophylogeny; granulovirus; host shifts; Lepidoptera; mPTP; multitrophic interactions; niche conservatism; nucleopolyhedrovirus; phylogenetics; resource tracking; species delimitation
Online: 5 June 2018 (06:32:44 CEST)
The Baculoviridae, a family of insect-specific large DNA viruses, is widely used in both biotechnology and biological control. Its applied value stems from millions of years of evolution influenced by interactions with its hosts and the environment. To understand how ecological interactions have shaped baculovirus diversification, we reconstructed a robust molecular phylogeny using 217 complete genomes and ~580 isolates for which at least one of four lepidopteran core genes was available. We then used a phylogenetic-concept-based approach (mPTP) to delimit 165 baculovirus species, including 38 species derived from new genetic data. Phylogenetic optimization of ecological characters revealed a general pattern of host conservatism punctuated by occasional shifts between closely related hosts and major shifts between lepidopteran superfamilies. Moreover, we found significant phylogenetic conservatism between baculoviruses and the type of plant growth (woody or herbaceous) associated with their insect hosts. In addition, we found that colonization of new ecological niches sometimes led to viral radiation. These macroevolutionary patterns show that, besides selection during the infection process, baculovirus diversification was influenced by tritrophic interactions, explained by their persistence on plants and interactions in the midgut during horizontal transmission. This complete eco-evolutionary framework highlights the potential innovations that could still be harnessed from the diversity of baculoviruses.
ARTICLE | doi:10.20944/preprints202206.0322.v1
Subject: Biology And Life Sciences, Virology Keywords: Viral Ecology; APMV; wild birds; Surveillance of Avian Paramyxoviruses; phylogenetics; MinION; Azov-Black Sea region in Ukraine
Online: 23 June 2022 (09:33:18 CEST)
Emerging RNA virus infections are a growing concern among domestic bird and poultry industries due to the severe impact they can have on flock health and economic livelihoods. Avian paramyxoviruses (APMV) are pathogenic, negative-sense RNA viruses that cause serious infections in the respiratory and central nervous systems. APMV was detected in multiple avian species during the 2017 migration season in Ukraine, and studied using PCR, virus isolation, and sequencing. Of the 4090 wild bird samples, viruses from eleven swabs were isolated in chicken embryos and identified by APMV serotype using the hemagglutination inhibition test: APMV-1, APMV-4, APMV-6, and APMV-7. At a variety of sites in Ukraine we characterized the virulence of the virus and further analyzed and predicted the potential risks of spillover to immunologically naïve populations. RNA was extracted and amplified using a multiplex-tiling primer approach to encompass full cDNA genomes. Full-length APMV-1 (n=5) and APMV-6 (n=2) genomes were sequenced on an Oxford Nanopore MinION device in Ukraine. All APMV-1 and APMV-6 fusion (F) proteins possessed a monobasic cleavage site, suggesting these APMV were likely low-virulence, annually circulating strains. Use of this low-cost method will help identify gaps in knowledge of viral evolution and circulation in this understudied but critical region for Eurasia.
ARTICLE | doi:10.20944/preprints202208.0338.v1
Subject: Biology And Life Sciences, Virology Keywords: migratory birds; Newcastle disease virus-GVII; poultry; phylogenetics; sequence-independent, single-primer amplification (SISPA); velogenic; whole genome sequencing (WGS)
Online: 18 August 2022 (10:40:10 CEST)
Newcastle disease virus (NDV) genotype VII is a highly pathogenic Orthoavulavirus that has caused multiple outbreaks among poultry in Egypt since 2011. This study aimed to investigate the genetic diversity of NDV prevailing in domestic and wild birds in Egyptian governorates. A total of 37 oropharyngeal swabs from wild birds and 101 swabs from domestic bird flocks, including chickens, ducks, turkeys, and swans, were collected from different geographic regions within 13 governorates during 2019-2020. Virus isolation and propagation in embryonated eggs revealed that 91 swab samples produced allantoic fluid with hemagglutination activity, suggestive of virus presence. RT-PCR targeting the F gene detected NDV in 85 samples. NDV was detected in 12 governorates, in domestic birds and in migratory and non-migratory wild birds. Following whole genome sequencing, we assembled six NDV genome sequences (70-99% genome coverage), including five full F gene sequences. All NDV strains carried high virulence, based on the presence of polybasic amino acids (RRQRF) at the F gene cleavage site. Phylogenetic analysis revealed that the NDV strains belonged to class II within genotype VII.1.1. The presence of genetically similar virulent NDV in wild birds further highlights their role in the dissemination of NDV in poultry populations across Egypt. Continued genomic surveillance in both wild birds and poultry will be necessary for monitoring NDV incursions and genetic diversification.