Preprint
Brief Report

This version is not peer-reviewed.

Detection of Pulmonary Toxoplasma gondii Causing Atypical Pneumonia in an Immunosuppressed Patient through Metagenomic Sequencing

Submitted:

21 June 2025

Posted:

23 June 2025


Abstract
Background: Pulmonary toxoplasmosis is a rare but critical opportunistic infection in immunosuppressed individuals, often presenting as atypical pneumonia. Conventional diagnostic methods, including serology and cultures, may yield false negatives, complicating timely diagnosis and treatment. Methods: A bronchoalveolar lavage sample was obtained from a 48-year-old immunosuppressed patient hospitalized with atypical pneumonia. The sample underwent next-generation sequencing (NGS)-based metagenomic analysis, including nucleic acid extraction, library preparation, high-throughput sequencing, and bioinformatic pipeline processing for microbial identification. Results: NGS identified Toxoplasma gondii DNA in the bronchoalveolar lavage sample, confirming a diagnosis of pulmonary toxoplasmosis. Remarkably, the patient responded clinically to metronidazole monotherapy, achieving significant symptom resolution. Follow-up computed tomography revealed complete radiological improvement, confirming clinical cure. Discussion: This case underscores the value of metagenomic sequencing in diagnosing atypical pathogens where conventional approaches fail. The identification of Toxoplasma gondii highlights the expanded utility of NGS in pulmonary infections, particularly in immunosuppressed patients. The unexpected success of metronidazole as a sole treatment raises questions about its potential efficacy against Toxoplasma gondii, warranting further investigation. Conclusion: Metagenomic diagnostics represent a powerful tool in identifying rare infectious agents and guiding effective interventions. This case highlights the clinical significance of NGS in improving diagnostic precision and broadening therapeutic horizons for pulmonary toxoplasmosis.

1. Introduction

Pulmonary toxoplasmosis, caused by the protozoan Toxoplasma gondii, is a rare but severe condition primarily affecting immunosuppressed individuals. Globally, T. gondii infects nearly one-third of the population, yet pulmonary involvement remains underreported due to its atypical presentation and diagnostic challenges. Immunocompromised patients, such as those undergoing chemotherapy, organ transplantation, or living with advanced HIV/AIDS, are at heightened risk. The clinical presentation often mimics other pulmonary conditions, with symptoms like fever, dyspnea, and respiratory distress complicating timely diagnosis [1].
Traditional diagnostic methods, including serology and culture, are limited in detecting pulmonary toxoplasmosis. Serological tests may yield false negatives in immunosuppressed patients due to impaired antibody production, while T. gondii is notoriously difficult to culture from clinical specimens [2]. Radiological findings, though valuable, are often non-specific and require integration with other diagnostic tools. These limitations underscore the need for advanced diagnostic approaches.
Next-generation sequencing (NGS) has emerged as a transformative tool in infectious disease diagnostics. Metagenomic NGS enables unbiased detection of microbial DNA directly from clinical samples, bypassing the need for prior pathogen suspicion [3,4]. This technology has proven particularly effective in identifying rare and atypical infections, including pulmonary toxoplasmosis, where conventional methods fail [4,5,6]. Moreover, NGS provides comprehensive insights into co-infections and antimicrobial resistance, enhancing clinical decision-making [5].
This study demonstrates the integration of metagenomic NGS with clinical expertise in infectious diseases, microbiology, and radiology to diagnose pulmonary toxoplasmosis. By combining advanced molecular diagnostics with comprehensive clinical evaluation, we present a novel approach to enhance diagnostic accuracy and patient outcomes. These findings support the broader implementation of NGS as a standard diagnostic tool for complex infectious diseases.

2. Materials and Methods

Sample Collection and Clinical Context
The bronchoalveolar lavage (BAL) fluid sample was collected from an 83-year-old immunosuppressed patient with clinical stage IV breast cancer presenting with atypical pneumonia at the Hospital Oncológico Solón Espinosa Ayala (SOLCA) in Quito, Ecuador. The patient exhibited severe respiratory distress, fever, and dyspnea. Serological and culture assays for bacterial, fungal, and viral pathogens yielded negative results. Upon the request of the Infectious Diseases practitioner at SOLCA, the sample was transferred to the Metagenomics Diagnostic Department at Universidad Internacional SEK (UISEK) for advanced diagnostic evaluation. No further information on the patient's clinical status was received, and no access to details of the clinical information is available.
Nucleic Acid Extraction and Quality Assessment
The BAL fluid sample was processed in a biosafety level 2 laboratory under aseptic conditions. DNA was extracted using the QIAamp DNA Mini Kit (Qiagen, Germany) following the manufacturer’s standard protocol, optimized for clinical samples. Extracted DNA was quantified using a Qubit 4 Fluorometer (Thermo Fisher Scientific, USA) to ensure the sample's suitability for downstream analysis.
Next-Generation Sequencing
The extracted DNA was sent to Macrogen, a certified sequencing service provider, for next-generation sequencing (NGS) using the Illumina NovaSeq 6000 platform. Library preparation was conducted using the Nextera XT DNA Library Prep Kit (Illumina, USA), and paired-end reads of 2 × 151 base pairs were generated. Sequencing achieved a depth of 89.1 million reads, surpassing the thresholds recommended for comprehensive microbial profiling.
Bioinformatics Workflow
Metagenomic analysis was conducted using the Galaxy Australia platform [7], employing a standardized pipeline validated in previous studies for pathogen identification. The workflow included the following steps:
  • Quality Control: Raw reads were assessed using FastQC (Version 0.74) and trimmed with Trim Galore (Version 0.6.7) to remove low-quality bases and adapter sequences [8].
  • Taxonomic Classification: High-confidence microbial reads were classified using Kraken2 (Version 2.1.3) against both the Minikraken2 database (optimized for rapid pathogen detection) [9] and the EuPathDB resource [10], which provides comprehensive reference genomes and annotations for eukaryotic parasites.
  • Abundance Estimation: Bracken (Version 3.1) refined taxonomic classifications and quantified species-level abundances [11].
  • Visualization: Krona charts were generated for interactive representation of the microbial composition [12].
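The classification step above produces a Kraken2 report that is then inspected for candidate pathogens. The following Python sketch shows one way such a report could be parsed and ranked at species level. It assumes the standard six-column, tab-separated Kraken2 report (percentage, clade reads, direct reads, rank code, taxonomy ID, name); the function names, the example rows, and the choice to exclude Homo sapiens (taxonomy ID 9606) are illustrative assumptions and not part of the Galaxy workflow used in this study.

```python
from __future__ import annotations

from typing import NamedTuple


class ReportRow(NamedTuple):
    percent: float     # percentage of reads in the clade rooted at this taxon
    clade_reads: int   # reads assigned to this taxon or any descendant
    direct_reads: int  # reads assigned directly to this taxon
    rank: str          # rank code, e.g. "G" for genus, "S" for species
    taxid: int         # NCBI taxonomy ID
    name: str          # scientific name (indentation stripped)


def parse_kraken2_report(text: str) -> list[ReportRow]:
    """Parse a standard six-column Kraken2 report into typed rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        pct, clade, direct, rank, taxid, name = line.split("\t")
        rows.append(
            ReportRow(float(pct), int(clade), int(direct),
                      rank.strip(), int(taxid), name.strip())
        )
    return rows


def top_species(rows, exclude_taxids=frozenset({9606}), n=5):
    """Return the n most abundant species-level rows, skipping excluded taxa."""
    species = [r for r in rows
               if r.rank == "S" and r.taxid not in exclude_taxids]
    return sorted(species, key=lambda r: r.clade_reads, reverse=True)[:n]


# Illustrative rows only: read counts are borrowed from Table 3, while the
# percentages, the direct-read split and the genus row are assumptions
# made up for this example.
EXAMPLE_REPORT = (
    " 99.67\t44383861\t44383861\tS\t9606\t      Homo sapiens\n"
    "  0.10\t45912\t0\tG\t5810\t    Toxoplasma\n"
    "  0.10\t45912\t45912\tS\t5811\t      Toxoplasma gondii\n"
    "  0.04\t16464\t16464\tS\t1304\t      Streptococcus salivarius\n"
)

if __name__ == "__main__":
    for row in top_species(parse_kraken2_report(EXAMPLE_REPORT)):
        print(row.name, row.clade_reads)
```

Ranking by clade reads rather than direct reads keeps reads assigned below species level attached to their species; other choices are equally defensible.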
Contamination Control and Validation
Negative controls processed alongside the sample confirmed the absence of contamination. Additionally, the pipeline's reliability was validated using publicly available datasets (National Institutes of Health) containing known Toxoplasma gondii sequences, demonstrating consistent pathogen detection and abundance [13].

3. Results

Sample Processing and Sequencing Metrics
The BAL fluid sample processed with the Qiagen DNA extraction method yielded high-quality nucleic acids, with a DNA concentration of 48 ng/µL. Sequencing generated 44,570,809 reads per end, with over 99% achieving a Phred score >30, indicating high data quality.
Pathogen Identification
A total of 44,530,522 paired-end reads were generated, of which 99.67% mapped to the human genome, leaving 141,589 non-host reads for downstream analysis (Table 1).
Taxonomic breakdown of the non–host fraction revealed that eukaryotic sequences comprised 47.79% of these reads, with Toxoplasma gondii accounting for 32.43% (45,912 reads) (Table 2). Bacterial sequences represented 52.18%, while viral and archaeal reads were negligible (<0.02% each).
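The read counts reported in Tables 1 and 2 can be cross-checked with simple arithmetic. The short Python sketch below re-derives the totals and percentages from the published counts; the variable and function names are our own, and last-digit differences from the tabulated percentages may arise from rounding versus truncation.

```python
# Read counts transcribed from Table 1 (all reads) and Table 2 (non-human reads).
TABLE_1 = {
    "Homo sapiens": 44_383_861,
    "Non-human reads": 141_589,
    "Unclassified": 5_072,
}
TABLE_2 = {
    "Eukaryota": 67_678,
    "Bacteria": 73_878,
    "Viruses": 25,
    "Archaea": 8,
}
EUKARYOTA_BREAKDOWN = {
    "Toxoplasma gondii": 45_912,
    "Fungi": 131,
    "Other eukaryotes": 138,
    "Unclassified eukaryotes": 21_497,
}


def percentages(counts, denominator=None):
    """Express each count as a percentage of the denominator (default: the sum)."""
    total = denominator if denominator is not None else sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


if __name__ == "__main__":
    print("Total reads:", sum(TABLE_1.values()))
    print("Non-human reads:", sum(TABLE_2.values()))
    for name, pct in percentages(TABLE_1).items():
        print(f"{name}: {pct:.2f} % of all reads")
    non_host = TABLE_1["Non-human reads"]
    for name, pct in percentages(EUKARYOTA_BREAKDOWN, non_host).items():
        print(f"{name}: {pct:.2f} % of non-human reads")
```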
Metagenomic analysis of the bronchoalveolar lavage sample identified several parasite sequences, including fragments associated with Sarcocystis neurona, Vitrella brassicaformis, and Chromera velia (Table 4). However, after assessing abundance, consistency of the sequencing profile, and its correlation with clinical presentation, only Toxoplasma gondii showed a robust and sustained signal, aligned with its recognized pathogenic potential in humans, especially in immunocompromised individuals (Table A1). The other findings, which probably reflect environmental contamination or bioinformatic cross-matches, lacked clinical support and were considered incidental.
Pipeline Validation and Specificity
Analysis of negative controls processed concurrently confirmed the absence of Toxoplasma gondii or other significant pathogens, ruling out contamination. Additionally, publicly available datasets containing T. gondii were re-analyzed, demonstrating consistent results and validating the reliability of the bioinformatics workflow.
Clinical Correlation and Radiological Findings
The identification of T. gondii was consistent with the patient’s clinical presentation and immunosuppressed status. Treatment with metronidazole monotherapy resulted in rapid clinical improvement, with symptoms resolving within two weeks. Follow-up computed tomography (CT) scans revealed the complete resolution of pulmonary infiltrates, supporting the microbiological diagnosis and treatment efficacy.

4. Discussion

This study highlights the transformative potential of metagenomic next-generation sequencing (NGS) in diagnosing complex infections, such as pulmonary toxoplasmosis, particularly in immunosuppressed patients. Conventional diagnostic methods, including serology and culture, often fail in cases of rare or atypical presentations, as demonstrated in this case. The inability of traditional tools to detect Toxoplasma gondii underscores the critical need for advanced diagnostic approaches, such as NGS [14,15].
The bioinformatics workflow employed in this study ensured high specificity and sensitivity. Human genome reads were subtracted using Bowtie2 alignment against the GRCh38 reference genome, eliminating host contamination and enabling focused microbial profiling [16]. Taxonomic classification with Kraken2 and species-level abundance estimation using Bracken confirmed Toxoplasma gondii as the causative agent, with no significant co-infections detected. These findings align with previous studies demonstrating the robustness of metagenomic pipelines in identifying pathogens in clinically ambiguous cases [14,15,16].
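Host subtraction after alignment amounts to keeping only the reads that fail to align to the human reference. The Python sketch below illustrates one conservative rule on SAM-formatted alignments: a read pair is retained as non-host only when both mates are flagged as unmapped (SAM flag bits 0x4 and 0x8). The exact Bowtie2 settings and filtering rule used in this study are not specified in the text, so this rule, the helper names, and the example records are assumptions for illustration only.

```python
UNMAPPED = 0x4       # SAM flag bit: this read is unmapped
MATE_UNMAPPED = 0x8  # SAM flag bit: the mate of this read is unmapped


def non_host_read_names(sam_lines):
    """Return names of read pairs in which neither mate aligned to the host."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & UNMAPPED and flag & MATE_UNMAPPED:
            names.add(qname)
    return names


def sam_record(qname, flag):
    """Build a minimal 11-field SAM line; alignment fields are placeholders."""
    return "\t".join(
        [qname, str(flag), "*", "0", "0", "*", "*", "0", "0", "ACGT", "IIII"]
    )


# Hypothetical records: only the flag values matter for this example.
EXAMPLE_SAM = [
    "@HD\tVN:1.6\tSO:unsorted",
    sam_record("read_host", 99),      # both mates aligned to the host
    sam_record("read_host", 147),
    sam_record("read_half", 73),      # this mate aligned, its partner did not
    sam_record("read_half", 133),
    sam_record("read_microbe", 77),   # neither mate aligned
    sam_record("read_microbe", 141),
]

if __name__ == "__main__":
    print(sorted(non_host_read_names(EXAMPLE_SAM)))
```

Requiring both mates to be unmapped discards pairs with any human-aligned mate, which favours specificity of the non-host fraction at the cost of losing some chimeric or partially aligned pairs.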
The clinical correlation of metagenomic findings further strengthens the diagnostic conclusion. The patient’s rapid recovery following metronidazole monotherapy, coupled with radiological resolution of pulmonary infiltrates, provides evidence for the causative role of Toxoplasma gondii [17]. While metronidazole is not a conventional treatment for toxoplasmosis, its apparent efficacy in this case raises intriguing questions about its potential therapeutic role, warranting further investigation [17,18,19].
Despite its advantages, the implementation of metagenomics in routine diagnostics faces challenges [14,15,16]. Cost and accessibility remain significant barriers, particularly in resource-limited settings. Additionally, the interpretation of metagenomic data requires specialized expertise in bioinformatics and infectious diseases, which may limit its widespread adoption [20]. However, the diagnostic and clinical benefits of NGS, as demonstrated in this study, far outweigh these limitations, particularly in cases where conventional methods fail to provide actionable results [5,6,15,16].
This study also underscores the importance of a multidisciplinary approach. Collaboration between clinical microbiologists, infectious disease specialists, and radiologists ensured that metagenomic findings were effectively contextualized within the patient’s clinical scenario. This integrated model of care serves as a benchmark for using advanced molecular tools in complex diagnostic cases [16].

5. Conclusions

In conclusion, metagenomic NGS represents a paradigm shift in infectious disease diagnostics, offering rapid, accurate, and comprehensive microbial identification. By bridging critical diagnostic gaps, NGS enables timely and targeted interventions, improving patient outcomes. The findings from this study advocate for the broader adoption of metagenomic technologies in clinical practice, supported by ongoing research to optimize workflows, reduce costs, and expand accessibility [16].

6. Limitations

We acknowledge several limitations in this study and appreciate the critical feedback that highlights areas for improvement, which we address below to foster transparency and clarity.
Specificity of pathogen identification: Some reviewers have raised concerns about the specificity of Toxoplasma gondii identification via metagenomic next-generation sequencing (NGS). While the abundance of reads and species-level confirmation using Kraken2 and Bracken strongly support the identification, we recognize the inherent challenges of metagenomic diagnostics in distinguishing true pathogens from potential background contamination. We addressed this by implementing rigorous negative controls and ensuring that human genome reads were filtered out through alignment with GRCh38. Furthermore, the correlation of metagenomic findings with clinical presentation and therapeutic response strengthens the diagnostic conclusion. However, future studies could explore additional validation methods, such as independent PCR assays, to further confirm these findings.
Pipeline reproducibility and generalizability: Concerns were expressed regarding the reproducibility of the bioinformatics pipeline used in this study. To address this, we utilized a standardized and widely adopted workflow on the Galaxy Australia platform, which has been validated in multiple prior studies. Additionally, the pipeline was tested on publicly available datasets with known microbial compositions, demonstrating consistent and accurate results. Nevertheless, bioinformatics tools and databases constantly evolve, and reproducibility may vary across platforms and laboratory conditions. Future efforts should focus on the standardization and cross-validation of pipelines across multiple institutions.
Absence of Co-Infections: Our results showed no significant co-infections. While our data did not reveal notable co-pathogens, other pathogens may have been present at low abundances below the detection threshold of our sequencing depth. Increasing sequencing depth or applying enrichment techniques could provide a more comprehensive view of the microbial community in future studies. However, the clear clinical improvement with targeted therapy against Toxoplasma gondii supports its role as the primary pathogen in this case.
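The dependence of detection on sequencing depth can be made concrete with a simple sampling model. Assuming, as a simplification, that reads are drawn independently and that a taxon contributes a fraction f of all sequenced reads, the probability of observing at least one of its reads among N reads is 1 − (1 − f)^N. The Python sketch below evaluates this expression; the model ignores classifier sensitivity, genome size, and host background, and the abundance used in the example is hypothetical.

```python
import math


def detection_probability(relative_abundance: float, total_reads: int) -> float:
    """Probability of sampling at least one read from a taxon.

    Uses P = 1 - (1 - f)**N, computed with log1p/expm1 so that very small
    values of f do not lose precision.
    """
    if not 0.0 <= relative_abundance <= 1.0:
        raise ValueError("relative_abundance must lie in [0, 1]")
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    if relative_abundance == 1.0:
        return 1.0 if total_reads > 0 else 0.0
    return -math.expm1(total_reads * math.log1p(-relative_abundance))


if __name__ == "__main__":
    depth = 44_530_522  # total reads reported in Table 1
    # Hypothetical taxon at one read per 100 million sequenced reads.
    print(f"{detection_probability(1e-8, depth):.3f}")
```

Under this model, a taxon present at a given fraction becomes more likely to be sampled as N grows, which is the rationale for deeper sequencing or enrichment when low-abundance co-pathogens are suspected.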
Therapeutic Implications: The effectiveness of metronidazole as monotherapy for Toxoplasma gondii infections was an unexpected finding that has been rightfully scrutinized. While this response could reflect unique patient-specific factors or off-target drug effects, it diverges from established treatment protocols for toxoplasmosis. Further investigation is necessary to elucidate the underlying mechanisms and evaluate the broader applicability of this therapeutic approach. We view this as an opportunity for future research rather than a limitation of the current study.
Cost and accessibility of metagenomics: The use of NGS in routine clinical diagnostics is limited by its high cost and technical requirements. While this study demonstrates metagenomics' significant clinical and diagnostic benefits, we recognize the need for strategies to enhance its accessibility. Initiatives to lower sequencing costs, improve computational efficiency, and provide training in bioinformatics will be essential for its integration into diverse healthcare settings.
Potential for host contamination: Concerns about host contamination influencing the results are essential to consider. By using Bowtie2 to filter out human genome reads aligned to GRCh38, we minimized this issue and ensured the specificity of microbial detection. However, we acknowledge that the complete elimination of host contamination is challenging and recommend improvements in wet lab protocols, such as host DNA depletion before sequencing, in future studies.

Author Contributions

J.D.A.-E.: Conceptualization, methodology, formal analysis, investigation, resources, clinical discussion, writing—original draft preparation, writing—review and editing, visualization, supervision, funding acquisition, and standardization. J.A.G.G.: Conceptualization, clinical discussion, writing, review, and editing. D.F.P.V.: Clinical discussion, writing, review, and editing, approval of all manuscript versions. M.M.G.V.: Clinical discussion, writing, review, and editing, approval of all manuscript versions. A.M.: Bioinformatics analysis, software implementation, funding acquisition, and approval of all manuscript versions. A.H.-Y.: Bioinformatics analysis, supervision, project administration, bioinformatics pipeline standardization, data curation, funding acquisition, approval of all manuscript versions.

Funding

This research was funded by the Research Department of Universidad Internacional SEK del Ecuador, with the approval of Prof. Dr. Juan Carlos Navarro, project lead Prof. Alexander Maldonado, and subproject leaders in clinical diagnostic metagenomics Prof. Dr. JD Acosta-España and Prof. Andrés Herrera-Yela. Universidad Internacional SEK del Ecuador financially supported the sequencing. The APC was funded by a waiver from the MDPI journals, which was granted to Prof. Dr. JD Acosta-España.

Institutional Review Board Statement

Ethical review and approval were waived for this study because the sample was obtained from the biobank of the Clinical Microbiology Laboratory of SOLCA and processed at the research facilities of Universidad Internacional SEK del Ecuador, in full compliance with national and international ethical standards. According to Ecuadorian regulations (Acuerdo Ministerial No. 4882, Reglamento de Investigación en Seres Humanos, Ministerio de Salud Pública del Ecuador) and international guidelines such as the Declaration of Helsinki and CIOMS, the use of fully anonymized, previously banked human biological material for secondary research purposes does not require additional ethics committee approval, provided no identifying information is used and no additional procedures are performed on human subjects.

Informed Consent Statement

The patient provided informed consent and signed a formal authorization permitting diagnostic amplification through sequencing and subsequent metagenomic analysis.

Data Availability Statement

The sequencing data supporting this study's findings are available from the corresponding author upon reasonable request. Due to ethical and privacy considerations, access is restricted to qualified researchers with appropriate justification.

Acknowledgments

The authors gratefully acknowledge the administrative support provided by the Clinical Microbiology Laboratory at SOLCA and the technical assistance of the sequencing team at Universidad Internacional SEK del Ecuador. We also thank the Research Department for logistical coordination throughout the study. While preparing this manuscript, the authors used Grammarly (2025 version) to refine the text and ensure clarity in the presentation of methods and results. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

None to declare

Abbreviations

The following abbreviations are used in this manuscript:
MDPI Multidisciplinary Digital Publishing Institute
NGS Next-Generation Sequencing
GRCh38 Genome Reference Consortium Human Build 38
EuPathDB Eukaryotic Pathogen Database
BAL Bronchoalveolar Lavage
SOLCA Sociedad de Lucha Contra el Cáncer del Ecuador
HIV Human Immunodeficiency Virus
AIDS Acquired Immunodeficiency Syndrome

Appendix A

Table A1. Bracken report from EuPathDB.
Name Taxonomy id Kraken assigned reads Fraction total reads
Sarcocystis neurona 42890 211 0.17806
Vitrella brassicaformis 1169539 89 0.07511
Chromera velia 505693 81 0.06835
Toxoplasma gondii 5811 52 0.04388
Candida albicans 5476 35 0.02954
Fusarium oxysporum 5507 25 0.0211
Cryptosporidium baileyi 27987 17 0.01435
Eimeria falciformis 84963 15 0.01266
Eimeria mitis 44415 14 0.01181
Cyclospora cayetanensis 88456 14 0.01181
Histoplasma capsulatum 5037 14 0.01181
Eimeria praecox 51316 10 0.00844
Theileria parva 5875 10 0.00844
Coccidioides immitis 5501 10 0.00844
Magnaporthe oryzae 318829 10 0.00844
Other eukaryotic pathogens 578 0.4877
Total 1
No significant co-infections were detected; minor microbial reads fell within expected backgrounds of environmental or commensal organisms. Together, these results provide strong evidence that Toxoplasma gondii was the primary causative agent in the patient’s atypical pneumonia.

References

  1. Derouin, F.; Garin, Y.J.F. Pulmonary Toxoplasmosis in Immunocompromised Patients. European Journal of Clinical Microbiology & Infectious Diseases 1993, 12, 475–476.
  2. Dubey, J.P. Toxoplasmosis of Animals and Humans; CRC Press, 2016; ISBN 9780429092954.
  3. Duan, H.; Li, X.; Mei, A.; Li, P.; Liu, Y.; Li, X.; Li, W.; Wang, C.; Xie, S. The Diagnostic Value of Metagenomic Next-Generation Sequencing in Infectious Diseases. BMC Infect Dis 2021, 21.
  4. Simner, P.J.; Miller, S.; Carroll, K.C. Understanding the Promises and Hurdles of Metagenomic Next-Generation Sequencing as a Diagnostic Tool for Infectious Diseases. Clin Infect Dis 2018, 66, 778–788.
  5. Ju, C.-R.; Lian, Q.-Y.; Guan, W.-J.; Chen, A.; Zhang, J.-H.; Xu, X.; Chen, R.-C.; Li, S.-Y.; He, J.-X. Metagenomic Next-Generation Sequencing for Diagnosing Infections in Lung Transplant Recipients: A Retrospective Study. Transplant International 2022, 36.
  6. He, S.; Wei, J.; Feng, J.; Liu, D.; Wang, N.; Chen, L.; Xiong, Y. The Application of Metagenomic Next-Generation Sequencing in Pathogen Diagnosis: A Bibliometric Analysis Based on Web of Science. Front Cell Infect Microbiol 2023, 13.
  7. Abueg, L.A.L.; Afgan, E.; Allart, O.; Awan, A.H.; Bacon, W.A.; Baker, D.; Bassetti, M.; Batut, B.; Bernt, M.; Blankenberg, D.; et al. The Galaxy Platform for Accessible, Reproducible, and Collaborative Data Analyses: 2024 Update. Nucleic Acids Res 2024, 52, W83–W94.
  8. Babraham Bioinformatics - FastQC A Quality Control Tool for High Throughput Sequence Data. Available online: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ (accessed on 19 June 2025).
  9. Wood, D.E.; Lu, J.; Langmead, B. Improved Metagenomic Analysis with Kraken 2. Genome Biol 2019, 20, 1–13.
  10. Aurrecoechea, C.; Brestelli, J.; Brunk, B.P.; Fischer, S.; Gajria, B.; Gao, X.; Gingle, A.; Grant, G.; Harb, O.S.; Heiges, M.; et al. EuPathDB: A Portal to Eukaryotic Pathogen Databases. Nucleic Acids Res 2009, 38, D415.
  11. Lu, J.; Breitwieser, F.P.; Thielen, P.; Salzberg, S.L. Bracken: Estimating Species Abundance in Metagenomics Data. PeerJ Comput Sci 2017, 3, e104.
  12. Ondov, B.D.; Bergman, N.H.; Phillippy, A.M. Interactive Metagenomic Visualization in a Web Browser. BMC Bioinformatics 2011, 12, 1–10.
  13. Chiu, C.Y.; Miller, S.A. Clinical Metagenomics. Nature Reviews Genetics 2019, 20, 341–355.
  14. Kim, M.; Park, S.J.; Park, H. Trend in Serological and Molecular Diagnostic Methods for Toxoplasma Gondii Infection. Eur J Med Res 2024, 29, 520.
  15. Hu, Z.; Weng, X.; Xu, C.; Lin, Y.; Cheng, C.; Wei, H.; Chen, W. Metagenomic Next-Generation Sequencing as a Diagnostic Tool for Toxoplasmic Encephalitis. Ann Clin Microbiol Antimicrob 2018, 17, 45.
  16. Kalbitz, S.; Ermisch, J.; Kellner, N.; Nickel, O.; Borte, S.; Marx, K.; Lübbert, C. Metagenomic Next-Generation Sequencing as a Diagnostic Tool in the Clinical Routine of an Infectious Diseases Department: A Retrospective Cohort Study. Infection 2024, 52, 1595–1600.
  17. Mack, D.G.; McLeod, R. New Micromethod to Study the Effect of Antimicrobial Agents on Toxoplasma Gondii: Comparison of Sulfadoxine and Sulfadiazine Individually and in Combination with Pyrimethamine and Study of Clindamycin, Metronidazole, and Cyclosporin A. Antimicrob Agents Chemother 1984, 26, 26–30.
  18. Chew, W.K.; Segarra, I.; Ambu, S.; Mak, J.W. Significant Reduction of Brain Cysts Caused by Toxoplasma Gondii after Treatment with Spiramycin Coadministered with Metronidazole in a Mouse Model of Chronic Toxoplasmosis. Antimicrob Agents Chemother 2012, 56, 1762–1768.
  19. Montazeri, M.; Sharif, M.; Sarvi, S.; Mehrzadi, S.; Ahmadpour, E.; Daryani, A. A Systematic Review of In Vitro and In Vivo Activities of Anti-Toxoplasma Drugs and Compounds (2006–2016). Front Microbiol 2017, 8.
  20. Navgire, G.S.; Goel, N.; Sawhney, G.; Sharma, M.; Kaushik, P.; Mohanta, Y.K.; Mohanta, T.K.; Al-Harrasi, A. Analysis and Interpretation of Metagenomics Data: An Approach. Biol Proced Online 2022, 24, 18.
Table 1. Total proportion of reads identified with Minikraken2.
Type of reads Number of reads %
Homo sapiens 44383861 99.67 %
Non-human reads 141589 0.32 %
Unclassified 5072 0.01 %
Total 44530522 100 %
Table 2. Proportion of non-human reads identified with Minikraken2.
Type of reads Number of reads %
Eukaryota 67678 47.79 %
  Toxoplasma gondii 45912 32.43 %
  Fungi 131 0.092 %
  Other eukaryotes 138 0.097 %
  Unclassified eukaryotes 21497 15.18 %
Bacteria 73878 52.18 %
Viruses 25 0.017 %
Archaea 8 0.005 %
Total 141589 100 %
Species-level abundance estimation with Bracken confirmed the dominance of T. gondii among microbial taxa, yielding 1,031 reads per million (RPM) total reads attributed to this pathogen (Kraken-assigned reads: 45,912; fraction = 0.00103) (Table 3). No other microbial species surpassed 500 RPM, and the most abundant commensals (e.g., Streptococcus salivarius, Prevotella spp.) each represented <0.05% of total reads.
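The reads-per-million figure follows directly from the read counts. As a minimal Python sketch of the normalization (the function name is our own; the counts are taken from Tables 1 and 3):

```python
def reads_per_million(taxon_reads: int, total_reads: int) -> float:
    """Normalize a taxon's read count to reads per million total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return taxon_reads / total_reads * 1_000_000


TOTAL_READS = 44_530_522  # Table 1

if __name__ == "__main__":
    # Toxoplasma gondii and Streptococcus salivarius counts from Table 3.
    print(round(reads_per_million(45_912, TOTAL_READS)))
    print(round(reads_per_million(16_464, TOTAL_READS)))
```

Normalizing to total reads makes the values comparable across samples sequenced to different depths.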
Table 3. Bracken report from Minikraken2.
Name Taxonomy ID Kraken assigned reads Fraction total reads
Homo sapiens 9606 44383861 0.99726
Toxoplasma gondii 5811 45912 0.00103
Streptococcus salivarius 1304 16464 0.00043
Prevotella veroralis 28137 15449 0.00035
Rothia mucilaginosa 43675 5050 0.00011
Veillonella atypica 39777 4523 0.0001
Prevotella histicola 470565 3186 0.00007
Veillonella parvula 29466 3113 0.00008
Prevotella melaninogenica 28132 2505 0.00007
Veillonella dispar 39778 2232 0.00005
Proteus phage VB_PmiS-Isfahan 1969841 1884 0.00004
Prevotella jejuni 1177574 1604 0.00004
Prevotella sp. oral taxon 299 652716 1128 0.00003
Streptococcus sp. HSISM1 1316408 875 0.00002
Streptococcus sp. FDAARGOS_192 1839799 841 0.00003
Prevotella oris 28135 841 0.00002
Streptococcus parasanguinis 1318 751 0.00002
Streptococcus thermophilus 1308 701 0.00002
Other microorganisms 8258 0.00014
Total 1
Table 4. Proportion of eukaryotic pathogen reads identified with EuPathDB.
Type of reads Number of reads %
Phylum Apicomplexa 456 25.26 %
  Sarcocystis neurona 211 11.69 %
  Toxoplasma gondii 52 2.88 %
  Neospora caninum 9 0.50 %
  Other Apicomplexa 184 10.19 %
Family Trypanosomatidae 198 10.96 %
Fungi 484 26.81 %
  Candida albicans 35 1.93 %
  Fusarium oxysporum 25 1.38 %
  Coccidioides 43 2.38 %
  Other fungi 381 21.10 %
Other eukaryotic pathogens 667 36.95 %
Total 1805 100 %
Only Toxoplasma gondii showed a robust, clinically consistent signal, emphasizing its role as a true pathogen. The other detected taxa are likely environmental contaminants or artifacts and are not considered clinically relevant.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
