Submitted:
20 July 2025
Posted:
21 July 2025
Abstract
The COVID-19 pandemic showcased the power of genomic surveillance to track infectious diseases, drive rapid public health responses, and foster global collaboration. This same infrastructure is being leveraged for malaria molecular surveillance (MMS) in Africa to tackle challenges such as artemisinin partial resistance and Plasmodium falciparum histidine-rich protein 2 and 3 gene deletions. However, variability in how sequencing methods and data are reported currently limits the validation, comparability, and reuse of data. To maximize the impact of MMS, we propose minimal and optimal reporting standards that are key for validation and that maximize transparency and adherence to the FAIR (Findable, Accessible, Interoperable, Reusable) principles. Rather than focusing on specific data formats, we propose what should be reported and why. Moving to reporting individual infection-level polymorphism or microhaplotype data is central to maximizing the impact of MMS. Reporting must adhere to local regulatory practices and ensure proper data oversight and management, preventing data colonialism and preserving opportunities for data generators. Because malaria’s challenges transcend borders, adopting standardized reporting practices is essential to advance research and strengthen global public health efforts.
Keywords:
Manuscript
- Identifying the molecular mechanism/origin of drug and diagnostic resistance
- Monitoring the prevalence/frequency and spread of drug or diagnostic resistance markers
- Classifying outcomes in therapeutic efficacy studies (TESs) as reinfection, recrudescence, or, in the case of P. vivax, relapse
- Estimating transmission intensity
- Estimating the connectivity and movement of parasites between geographically distinct populations
- Classifying malaria cases as locally acquired or imported from another population
- Reconstructing granular patterns of transmission
Funding
Acknowledgements
Conflicts of Interest
References
- Kames J, Holcomb DD, Kimchi O, DiCuccio M, Hamasaki-Katagiri N, Wang T, Komar AA, Alexaki A, Kimchi-Sarfaty C., 2020. Sequence analysis of SARS-CoV-2 genome reveals features important for vaccine design. Scientific Reports 10: 1–11. [CrossRef]
- Abera A, Belay H, Zewude A, Gidey B, Nega D, Dufera B, Abebe A, Endriyas T, Getachew B, Birhanu H, Difabachew H, Mekonnen B, Legesse H, Bekele F, Mekete K, et al., 2020. Establishment of COVID-19 testing laboratory in resource-limited settings: challenges and prospects reported from Ethiopia. Glob Health Action 13: 1841963. [CrossRef]
- Wang L, Didelot X, Yang J, Wong G, Shi Y, Liu W, Gao GF, Bi Y., 2020. Inference of person-to-person transmission of COVID-19 reveals hidden super-spreading events during the early outbreak phase. Nature Communications 11: 1–6. [CrossRef]
- Zhang W, Govindavari JP, Davis BD, Chen SS, Kim JT, Song J, Lopategui J, Plummer JT, Vail E., 2020. Analysis of Genomic Characteristics and Transmission Routes of Patients With Confirmed SARS-CoV-2 in Southern California During the Early Stage of the US COVID-19 Pandemic. JAMA Network Open 3: e2024191. [CrossRef]
- Chan JF-W, Yuan S, Kok K-H, To KK-W, Chu H, Yang J, Xing F, Liu J, Yip CC-Y, Poon RW-S, Tsoi H-W, Lo SK-F, Chan K-H, Poon VK-M, Chan W-M, et al., 2020. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster. Lancet (London, England) 395: 514.
- Bedford T, Greninger AL, Roychoudhury P, Starita LM, Famulare M, Huang ML, Nalla A, Pepper G, Reinhardt A, Xie H, Shrestha L, Nguyen TN, Adler A, Brandstetter E, Cho S, et al., 2020. Cryptic transmission of SARS-CoV-2 in Washington state. Science (New York, NY) 370. [CrossRef]
- Cerami C, Popkin-Hall ZR, Rapp T, Tompkins K, Zhang H, Muller MS, Basham C, Whittelsey M, Chhetri SB, Smith J, Litel C, Lin KD, Churiwal M, Khan S, Rubinstein R, et al., 2022. Household Transmission of Severe Acute Respiratory Syndrome Coronavirus 2 in the United States: Living Density, Viral Load, and Disproportionate Impact on Communities of Color. Clinical infectious diseases : an official publication of the Infectious Diseases Society of America 74. [CrossRef]
- Anon. Website. Available at: https://academic.oup.com/bioinformatics/article/34/23/4121/5001388. Accessed.
- Knock ES, Whittles LK, Lees JA, Perez-Guzman PN, Verity R, FitzJohn RG, Gaythorpe KAM, Imai N, Hinsley W, Okell LC, Rosello A, Kantas N, Walters CE, Bhatia S, Watson OJ, et al., 2021. Key epidemiological drivers and impact of interventions in the 2020 SARS-CoV-2 epidemic in England. Sci Transl Med 13. [CrossRef]
- Perkins TA, España G., 2020. Optimal Control of the COVID-19 Pandemic with Non-pharmaceutical Interventions. Bulletin of Mathematical Biology 82: 1–24. [CrossRef]
- Walker PGT, Whittaker C, Watson OJ, Baguelin M, Winskill P, Hamlet A, Djafaara BA, Cucunubá Z, Olivera Mesa D, Green W, Thompson H, Nayagam S, Ainslie KEC, Bhatia S, Bhatt S, et al., 2020. The impact of COVID-19 and strategies for mitigation and suppression in low- and middle-income countries. Science 369: 413–422. [CrossRef]
- Chiu WA, Ndeffo-Mbah ML., 2021. Using test positivity and reported case rates to estimate state-level COVID-19 prevalence and seroprevalence in the United States. PLOS Computational Biology 17: e1009374. [CrossRef]
- Ling-Hu T, Rios-Guzman E, Lorenzo-Redondo R, Ozer EA, Hultquist JF., 2022. Challenges and Opportunities for Global Genomic Surveillance Strategies in the COVID-19 Era. Viruses 14. [CrossRef]
- Anon. WHO global genomic surveillance strategy for pathogens with pandemic and epidemic potential 2022-2032. Available at: https://www.who.int/initiatives/genomic-surveillance-strategy. Accessed.
- Dalmat R, Naughton B, Kwan-Gett TS, Slyker J, Stuckey EM., 2019. Use cases for genetic epidemiology in malaria elimination. Malaria Journal 18: 1–11. [CrossRef]
- Juliano JJ, Giesbrecht DJ, Simkin A, Fola AA, Lyimo BM, Pereus D, Bakari C, Madebe RA, Seth MD, Mandara CI, Popkin-Hall ZR, Moshi R, Mbwambo RB, Niaré K, MacInnis B, et al., 2024. Prevalence of mutations associated with artemisinin partial resistance and sulfadoxine-pyrimethamine resistance in 13 regions in Tanzania in 2021: a cross-sectional survey. Lancet Microbe 5: 100920. [CrossRef]
- Ishengoma DS, Mandara CI, Bakari C, Fola AA, Madebe RA, Seth MD, Francis F, Buguzi CC, Moshi R, Garimo I, Lazaro S, Lusasi A, Aaron S, Chacky F, Mohamed A, et al., 2024. Evidence of artemisinin partial resistance in northwestern Tanzania: clinical and molecular markers of resistance. Lancet Infect Dis 24: 1225–1233. [CrossRef]
- Oyola SO, Ariani CV, Hamilton WL, Kekre M, Amenga-Etego LN, Ghansah A, Rutledge GG, Redmond S, Manske M, Jyothi D, Jacob CG, Otto TD, Rockett K, Newbold CI, Berriman M, et al., 2016. Whole genome sequencing of Plasmodium falciparum from dried blood spots using selective whole genome amplification. Malar J 15: 597. [CrossRef]
- Hathaway NJ, Parobek CM, Juliano JJ, Bailey JA., 2018. SeekDeep: single-base resolution de novo clustering for amplicon deep sequencing. Nucleic Acids Res 46: e21. [CrossRef]
- Sadler JM, Simkin A, Tchuenkam VPK, Gyuricza IG, Fola AA, Wamae K, Assefa A, Niaré K, Thwai K, White SJ, Moss WJ, Dinglasan RR, Nsango S, Tume CB, Parr JB, et al., 2024. Application of a new highly multiplexed amplicon sequencing tool to evaluate antimalarial resistance and relatedness in individual and pooled samples from Dschang, Cameroon. [CrossRef]
- Holzschuh A, Lerch A, Gerlovina I, Fakih BS, Al-mafazy A-WH, Reaves EJ, Ali A, Abbas F, Ali MH, Ali MA, Hetzel MW, Yukich J, Koepfli C., 2023. Multiplexed ddPCR-amplicon sequencing reveals isolated Plasmodium falciparum populations amenable to local elimination in Zanzibar, Tanzania. Nature Communications 14: 1–16. [CrossRef]
- LaVerriere E, Schwabl P, Carrasquilla M, Taylor AR, Johnson ZM, Shieh M, Panchal R, Straub TJ, Kuzma R, Watson S, Buckee CO, Andrade CM, Portugal S, Crompton PD, Traore B, et al., 2022. Design and implementation of multiplexed amplicon sequencing panels to serve genomic epidemiology of infectious disease: A malaria case study. Mol Ecol Resour 22: 2285–2303. [CrossRef]
- Aranda-Díaz A, Vickers EN, Murie K, Palmer B, Hathaway N, Gerlovina I, Boene S, Garcia-Ulloa M, Cisteró P, Katairo T, Semakuba FD, Nsengimaana B, Gwarinda H, García-Fernández C, Da Silva C, et al., 2024. Sensitive and modular amplicon sequencing of diversity and resistance for research and public health. [CrossRef]
- Tessema SK, Hathaway NJ, Teyssier NB, Murphy M, Chen A, Aydemir O, Duarte EM, Simone W, Colborn J, Saute F, Crawford E, Aide P, Bailey JA, Greenhouse B., 2022. Sensitive, Highly Multiplexed Sequencing of Microhaplotypes From the Plasmodium falciparum Heterozygome. J Infect Dis 225: 1227–1237. [CrossRef]
- Verity R, Aydemir O, Brazeau NF, Watson OJ, Hathaway NJ, Mwandagalirwa MK, Marsh PW, Thwai K, Fulton T, Denton M, Morgan AP, Parr JB, Tumwebaze PK, Conrad M, Rosenthal PJ, et al., 2020. The impact of antimalarial resistance on the genetic structure of Plasmodium falciparum in the DRC. Nat Commun 11: 2107. [CrossRef]
- Aydemir O, Janko M, Hathaway NJ, Verity R, Mwandagalirwa MK, Tshefu AK, Tessema SK, Marsh PW, Tran A, Reimonn T, Ghani AC, Ghansah A, Juliano JJ, Greenhouse BR, Emch M, et al., 2018. Drug-Resistance and Population Structure of Plasmodium falciparum Across the Democratic Republic of Congo Using High-Throughput Molecular Inversion Probes. J Infect Dis 218: 946–955. [CrossRef]
- Ruybal-Pesántez S, McCann K, Vibin J, Siegel S, Auburn S, Barry AE., 2024. Molecular markers for malaria genetic epidemiology: progress and pitfalls. Trends Parasitol 40: 147–163. [CrossRef]
- Early AM, Daniels RF, Farrell TM, Grimsby J, Volkman SK, Wirth DF, MacInnis BL, Neafsey DE., 2019. Detection of low-density Plasmodium falciparum infections using amplicon deep sequencing. Malar J 18: 219. [CrossRef]
- Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJA, Holmes SP., 2016. DADA2: High-resolution sample inference from Illumina amplicon data. Nat Methods 13: 581–583. [CrossRef]
- Lerch A, Koepfli C, Hofmann NE, Messerli C, Wilcox S, Kattenberg JH, Betuela I, O’Connor L, Mueller I, Felger I., 2017. Development of amplicon deep sequencing markers and data analysis pipeline for genotyping multi-clonal malaria infections. BMC Genomics 18: 864. [CrossRef]
- Bailey JA, Mvalo T, Aragam N, Weiser M, Congdon S, Kamwendo D, Martinson F, Hoffman I, Meshnick SR, Juliano JJ., 2012. Use of massively parallel pyrosequencing to evaluate the diversity of and selection on Plasmodium falciparum csp T-cell epitopes in Lilongwe, Malawi. J Infect Dis 206: 580–587. [CrossRef]
- Rosenthal PJ, Asua V, Bailey JA, Conrad MD, Ishengoma DS, Kamya MR, Rasmussen C, Tadesse FG, Uwimana A, Fidock DA., 2024. The emergence of artemisinin partial resistance in Africa: how do we respond? Lancet Infect Dis 24: e591–e600.
- Rosenthal PJ, Asua V, Conrad MD., 2024. Emergence, transmission dynamics and mechanisms of artemisinin partial resistance in malaria parasites in Africa. Nature reviews Microbiology 22. [CrossRef]
- Ishengoma DS, Gosling R, Martinez-Vega R, Beshir KB, Bailey JA, Chimumbwa J, Sutherland C, Conrad MD, Tadesse FG, Juliano JJ, Kamya MR, Mbacham WF, Ménard D, Rosenthal PJ, Raman J, et al., 2024. Urgent action is needed to confront artemisinin partial resistance in African malaria parasites. Nature medicine 30.
- Martin AC, Sadler JM, Simkin A, Musonda M, Katowa B, Matoba J, Schue J, Simulundu E, Bailey JA, Moss WJ, Juliano JJ, Fola AA., 2025. Emergence and Rising Prevalence of Artemisinin Partial Resistance Marker Kelch13 P441L in a Low Malaria Transmission Setting in Southern Zambia. [CrossRef]
- Holzschuh A, Lerch A, Nsanzabana C., 2024. Multiplexed nanopore amplicon sequencing to distinguish recrudescence from new infection in antimalarial drug trials.
- Fola AA, Feleke SM, Mohammed H, Brhane BG, Hennelly CM, Assefa A, Crudal RM, Reichert E, Juliano JJ, Cunningham J, Mamo H, Solomon H, Tasew G, Petros B, Parr JB, et al., 2023. Plasmodium falciparum resistant to artemisinin and diagnostics have emerged in Ethiopia. Nature microbiology 8. [CrossRef]
- Berhane A, Anderson K, Mihreteab S, Gresty K, Rogier E, Mohamed S, Hagos F, Embaye G, Chinorumba A, Zehaie A, Dowd S, Waters NC, Gatton ML, Udhayakumar V, Cheng Q, et al., 2018. Major Threat to Malaria Control Programs by Plasmodium falciparum Lacking Histidine-Rich Protein 2, Eritrea. Emerg Infect Dis 24: 462–470. [CrossRef]
- Feleke SM, Reichert EN, Mohammed H, Brhane BG, Mekete K, Mamo H, Petros B, Solomon H, Abate E, Hennelly C, Denton M, Keeler C, Hathaway NJ, Juliano JJ, Bailey JA, et al., 2021. Plasmodium falciparum is evolving to escape malaria rapid diagnostic tests in Ethiopia. Nat Microbiol 6: 1289–1299. [CrossRef]
- Thomson R, Parr JB, Cheng Q, Chenet S, Perkins M, Cunningham J., 2020. Prevalence of Plasmodium falciparum lacking histidine-rich proteins 2 and 3: a systematic review. Bull World Health Organ 98: 558–568F. [CrossRef]
- Mathur MB, Fox MP., 2023. Toward Open and Reproducible Epidemiology. Am J Epidemiol 192: 658–664. [CrossRef]
- Peng RD, Dominici F, Zeger SL., 2006. Reproducible epidemiologic research. Am J Epidemiol 163: 783–789.
- Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten J-W, da Silva Santos LB, Bourne PE, Bouwman J, Brookes AJ, Clark T, Crosas M, Dillo I, et al., 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3: 1–9. [CrossRef]
- Blasco B, Leroy D, Fidock DA., 2017. Antimalarial drug resistance: linking Plasmodium falciparum parasite biology to the clinic. Nat Med 23: 917–928.
- Fidock DA, Eastman RT, Ward SA, Meshnick SR., 2008. Recent highlights in antimalarial drug resistance and chemotherapy research. Trends Parasitol 24: 537–544.
- Ippolito MM, Moser KA, Kabuya J-BB, Cunningham C, Juliano JJ., 2021. Antimalarial Drug Resistance and Implications for the WHO Global Technical Strategy. Curr Epidemiol Rep 8: 46–62.
- Conrad MD, Rosenthal PJ., 2019. Antimalarial drug resistance in Africa: the calm before the storm? Lancet Infect Dis 19: e338–e351.
- Picot S, Olliaro P, de Monbrison F, Bienvenu A-L, Price RN, Ringwald P., 2009. A systematic review and meta-analysis of evidence for correlation between molecular markers of parasite resistance and treatment outcome in falciparum malaria. Malar J 8: 89. [CrossRef]
- Vauterin P, Jeffery B, Miles A, Amato R, Hart L, Wright I, Kwiatkowski D., 2017. Panoptes: web-based exploration of large scale genome variation data. Bioinformatics 33: 3243–3249. [CrossRef]
- Soremekun S, Conteh B, Nyassi A, Soumare HM, Etoketim B, Ndiath MO, Bradley J, D’Alessandro U, Bousema T, Erhart A, Moreno M, Drakeley C., 2024. Household-level effects of seasonal malaria chemoprevention in the Gambia. Commun Med (Lond) 4: 97.
- Thwing J, Williamson J, Cavros I, Gutman JR., 2024. Systematic Review and Meta-Analysis of Seasonal Malaria Chemoprevention. Am J Trop Med Hyg 110: 20–31. [CrossRef]
- Deutsch-Feldman M, Aydemir O, Carrel M, Brazeau NF, Bhatt S, Bailey JA, Kashamuka M, Tshefu AK, Taylor SM, Juliano JJ, Meshnick SR, Verity R., 2019. The changing landscape of Plasmodium falciparum drug resistance in the Democratic Republic of Congo. BMC Infect Dis 19: 872.
- Nankabirwa J, Brooker SJ, Clarke SE, Fernando D, Gitonga CW, Schellenberg D, Greenwood B., 2014. Malaria in school-age children in Africa: an increasingly important challenge. Trop Med Int Health 19: 1294–1309. [CrossRef]
- Okiring J, Epstein A, Namuganga JF, Kamya EV, Nabende I, Nassali M, Sserwanga A, Gonahasa S, Muwema M, Kiwuwa SM, Staedke SG, Kamya MR, Nankabirwa JI, Briggs J, Jagannathan P, et al., 2022. Gender difference in the incidence of malaria diagnosed at public health facilities in Uganda. Malar J 21: 22.
- Tessema S, Wesolowski A, Chen A, Murphy M, Wilheim J, Mupiri A-R, Ruktanonchai NW, Alegana VA, Tatem AJ, Tambo M, Didier B, Cohen JM, Bennett A, Sturrock HJ, Gosling R, et al., 2019. Using parasite genetic and human mobility data to infer local and cross-border malaria connectivity in Southern Africa. Elife 8. [CrossRef]
- Anon. WHO external quality assurance scheme for malaria nucleic acid amplification testing. Available at: https://www.who.int/teams/global-malaria-programme/case-management/diagnosis/nucleic-acid-amplification-based-diagnostics/faq-nucleic-acid-amplification-tests. Accessed.
- Cunningham JA, Thomson RM, Murphy SC, de la Paz Ade M, Ding XC, Incardona S, Legrand E, Lucchi NW, Menard D, Nsobya SL, Saez AC, Chiodini PL, Shrivastava J., 2020. WHO malaria nucleic acid amplification test external quality assessment scheme: results of distribution programmes one to three. Malar J 19: 129. [CrossRef]
- Mideo N, Kennedy DA, Carlton JM, Bailey JA, Juliano JJ, Read AF., 2013. Ahead of the curve: next generation estimators of drug resistance in malaria infections. Trends Parasitol 29: 321–328. [CrossRef]
- Ruybal-Pesántez S, Amaya-Romero J, Bérubé S, Brazeau NF, Diop MF, Hathaway N, Hendry J, McCann K, Murie K, Murphy M, Niaré K, Phelan J, Schaffner SF, Simkin A, Taylor AR, et al., 2025. Towards an open analysis ecosystem for Plasmodium genomic epidemiology.
- Anon., 2022. Seamless sharing and peer review of code. Nat Comput Sci 2: 773.
- Senn SJ., 2009. Overstating the evidence: double counting in meta-analysis and related problems. BMC Med Res Methodol 9: 10.
- Moodley K, Cengiz N, Domingo A, Nair G, Obasa AE, Lessells RJ, de Oliveira T., 2022. Ethics and governance challenges related to genomic data sharing in southern Africa: the case of SARS-CoV-2. Lancet Glob Health 10: e1855–e1859.
- Piasecki J, Cheah PY., 2022. Ownership of individual-level health data, data sharing, and data governance. BMC Medical Ethics 23: 1–9. [CrossRef]
- Bull S, Bhagwandin N., 2020. The ethics of data sharing and biobanking in health research. Wellcome Open Res 5: 270. [CrossRef]
| Variable | Minimum Standard | Optimal Standard |
| Study and Participant MetaData | ||
| Raw Sequence | All studies should provide underlying raw sequencing data for reproducibility of findings by others. | Same as minimum. |
| Raw sequencing data are the key to true reproducibility and validity of any study and should be required. Without raw data, inappropriate analyses leading to called variants or microhaplotypes can never be properly addressed. This also optimizes data for use for other scientific questions. | ||
| Metadata | All key de-identified variables used in the study for the published work, deposited in a sustainable uncontrolled public database (e.g., open access). | All key variables that allow full reanalysis and validation of the study, deposited in a sustainable controlled public database (e.g., access needs approval because the data may contain identifiable information). |
| Full metadata can potentially lead to participant identification -- although the risk of negative impact to study participants is low given malaria is a common, unstigmatized disease. Optimally, all data exactly as used in published analyses is deposited into a controlled database that allows for registered, vetted scientists to reproduce, validate, and extend work. | ||
| Methods/Code | Detailed methods used for processing sequence data, programs, settings, filtering, and analysis with metadata. | Fully reproducible coding pipeline that takes data and produces all results and figures from the main analysis. |
| While detailed written methods are key, the exact code used to analyze data and generate figures allows others to examine and check the analysis. Deposition of code in GitHub or a similar platform is obligatory. Newer approaches to code reproducibility, such as Code Ocean compute capsules, are being implemented [60]. | ||
| Sequencing Panel/Assay | Genomic locations sequenced and genotyped. | Complete description of panel target regions and any filtered regions that may have been ignored due to high levels of known sequencing error. |
| Understanding the gene or genomic locations assayed by a panel allows for better integration of data. Panel design should be deposited in an easily accessible public database that is fixed in version at the time of the study. Combined with microhaplotype or allele depth, this allows for retrospective determination of reference genotypes for new mutations found later, since the older study would have found only wild type and thus would not have reported a nonvariant site. Filtered regions removed due to difficult-to-assess repeats or error-prone sequences are important to report, since underlying variation found in these regions by subsequent studies would require reanalysis. Microhaplotypes and their within-sample counts represent a compact, lossless format that readily shows how well supported the absence of a later-discovered mutation is in earlier sample sets. | ||
| Controls | Set of parasite standards to provide context of sensitivity and specificity; All studies should be run with negative controls (e.g., human DNA or water). | In addition to controls, random replicate samples to assess assay variation in 5-10% of samples. |
| Laboratory-derived controls ensure consistent assay performance but cannot address the sample quality for a given experiment. Thus, repeating a percentage of samples (biological replicates) and assays (technical replicates) provides a more robust assessment of a given sample set. Ultimately, replicates (duplicates or even triplicates) can help control for noise and jackpot events, although these efforts increase costs. | ||
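The replicate recommendation in the Controls row (re-running 5-10% of samples) can be sketched as follows. This is a minimal illustration only: the function name `choose_replicates`, the sample IDs, and the fixed random seed are hypothetical and not part of the proposed standard; recording the seed is simply one way to make the selection itself reproducible.

```python
import random


def choose_replicates(sample_ids, percent=5, seed=0):
    """Pick a random subset of samples to re-run as replicates.

    `percent` is an integer percentage (5-10 matches the range in the
    table); the count is rounded up so small studies still get one replicate.
    """
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ids = list(sample_ids)
    if not ids:
        return []
    # Integer ceiling division avoids floating-point rounding surprises.
    n = max(1, -(-len(ids) * percent // 100))
    rng = random.Random(seed)  # seeded so the selection can be reported and re-derived
    return sorted(rng.sample(ids, n))


# Hypothetical study with 200 samples, replicating 5% of them.
samples = [f"S{i:03d}" for i in range(1, 201)]
replicates = choose_replicates(samples, percent=5, seed=42)
print(len(replicates), "samples selected for replication")
```

Reporting the selection procedure (and seed, if one is used) alongside the replicate results lets others confirm that replicates were chosen at random rather than by sample quality.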
| Variable | Minimum Standard | Optimal Standard |
| Study and Participant MetaData | ||
| Date of collection | Month and year of collection; Start and end date of study (maximal aggregation over a year). | Individual collection date (jittered if malaria diagnosis date is considered identifying information to maintain longitudinal order at site). |
| Location of collection | Collection site or aggregated neighboring collection sites with GPS coordinate of clinic used or centroid of neighborhood. Clinic, village or town should be easily attainable. | Highest resolution data possible (GPS location of household, clinic of collection, town/city of collection) at the individual level (jittered if considered potentially identifying information). |
| Age at time of collection | Age in years at time of collection. | Age in years and months or years to a single decimal place (at study start if longitudinal). |
| Sex | As collected by the study. | As collected by the study. |
| Treatment status | Pre-treatment or post-treatment. Important to understand if frequencies or prevalences of drug resistance mutations could be skewed due to recent drug pressure in the individual. | Complicated studies with multiple time points should delineate timing of sampling -- e.g., TES. |
| Sampling strategy | Symptomatic, asymptomatic, community, clinic, etc. on a study level. | Assigned to each individual sample in cases of complex study design. |
| Travel information | If available. | Provide all available travel information at the individual level. |
| Sequencing and Genotyping Data | ||
| Variant/haplotype calls | Nucleotide or amino acid change at variant sites called. Heterozygous or homozygous calls of known public health importance. Individual-level data should be reported for specific mutations, including validated resistance variants without observed variant genotypes. FAIR-format variant calls such as VCF or, preferably, gVCF provided supplementally. With next-generation sequencing (NGS) data, reporting within-sample allele frequencies is important. These data should be unfiltered (no sites or samples removed beyond baseline initial variant calling). | Individual-level full microhaplotypes, if generated, and genotyping data (amino acid and nucleotide, indels, etc.) across all regions sequenced/variants called, provided in FAIR formats. Microhaplotypes that maintain linkage information are optimal. Microhaplotypes and, to a slightly lesser extent, full gVCFs allow for examination of potential new mutations that might be captured but otherwise not recognized initially. These data should be unfiltered (no sites or samples removed beyond baseline initial variant calling). |
| Read or UMI depth | Number of reads or unique molecular identifiers (UMIs) informing each genotype (SNP or combination of SNPs) reported at each locus by individual. Total number of reads per locus reported. This is key to any quality assessment to know how much weight each sample gets. | Number of reads or unique molecular identifiers (UMIs) informing each full haplotype called (not just those reported in the manuscript) at each locus by individual. Total number of reads per locus reported. Read depth provides a limited approximation of the information content, whereas UMIs provide a fuller accounting traceable to individual molecules of template in the sample. |
| Frequency (population and within sample of allele or variant) | Average allele frequency for aggregate site/region. | Within-sample allele frequencies for each participant; these can be calculated directly or from read depth/UMI counts. |
| Allele frequency is reported less consistently than prevalence. However, frequency is much more robust to sequencing error or low-level contamination. For instance, suppose that among 100 samples there are 10 samples with errors reporting K13 C580Y at a within-sample allele frequency of 1% each. For those 100 samples, this would result in a reported prevalence of 10%, but an average population allele frequency for 580Y of only 0.1%. There is concern that such errors occur when there is a high percentage of mixed infections for a given mutation. | ||
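The K13 C580Y example in the final table row can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the read depth of 1,000 reads per sample, the function names, and the choice of an unweighted mean of within-sample allele frequencies as the population allele frequency are assumptions made for this example, not methods prescribed by the authors.

```python
def within_sample_af(alt_reads, total_reads):
    """Within-sample allele frequency from read (or UMI) counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def prevalence(wsafs, threshold=0.0):
    """Fraction of samples in which the allele is detected at all."""
    return sum(1 for f in wsafs if f > threshold) / len(wsafs)


def mean_allele_frequency(wsafs):
    """Unweighted mean of within-sample allele frequencies (an assumption)."""
    return sum(wsafs) / len(wsafs)


# Hypothetical counts: 100 samples at 1,000 reads each; 10 samples carry
# 10 erroneous 580Y reads (1% within-sample frequency), 90 carry none.
read_counts = [(10, 1000)] * 10 + [(0, 1000)] * 90
wsafs = [within_sample_af(alt, total) for alt, total in read_counts]

print(f"prevalence = {prevalence(wsafs):.1%}")
print(f"mean allele frequency = {mean_allele_frequency(wsafs):.2%}")
```

With these inputs the prevalence is 10% while the mean allele frequency is 0.1%, matching the figures in the table. Raising `threshold` (for example to 0.02) would drop the erroneous calls from the prevalence estimate, which is only possible when per-sample read or UMI depth is reported.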
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
