Preprint
Review

This version is not peer-reviewed.

Large Language Model as a Promising Framework for the Complete Clinical Interpretation of Human Genetic Variants

Submitted:

10 May 2026

Posted:

11 May 2026


Abstract
The number of human genetic variants cataloged in dbSNP has plateaued since 2021 at roughly 1.1 billion. Because the human pangenome reference now enables the precise identification of even structurally complex variants, capturing the entire spectrum of human genetic variation is almost achievable. However, the clinical impact of most genetic variants remains elusive. This is due to the limitations of the genome-wide association study (GWAS), the standard framework for variant interpretation, which relies solely on statistical assumptions. GWAS can neither interpret low-frequency alleles nor capture molecular interactions between variants, which hinders its ability to explain complex traits and diseases. Recently, large language models (LLMs) have enabled accurate inference of the pathogenicity of human genetic variants without requiring a large sample size or prior annotations, by modeling the biological principles encoded within the genome. For instance, Evolutionary Scale Modeling (ESM1b) predicted the pathogenicity of missense variants in ClinVar with an auROC of up to 0.905. In addition, Evo 2 classified non-coding pathogenic variants in ClinVar with an auROC of 0.987 for single nucleotide variants (SNVs) and 0.971 for non-SNVs. These results suggest that, although LLMs are so far limited to pathogenicity prediction, integrating multiomic and clinical data through LLMs will enable the complete clinical interpretation of human genetic variants.
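The auROC values quoted above measure how well a model's per-variant scores rank pathogenic variants above benign ones. The sketch below is only an illustration of that metric, not the evaluation code used for ESM1b or Evo 2. The function name `auroc`, the score orientation (higher means more likely pathogenic), and the toy scores and labels are all assumptions made for this example.

```python
def auroc(scores, labels):
    """Area under the ROC curve, computed as the Mann-Whitney U statistic.

    scores: one number per variant; assumed here that higher = more likely pathogenic.
    labels: 1 for pathogenic, 0 for benign (e.g. curated ClinVar-style labels).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one pathogenic and one benign variant")

    # Count pathogenic/benign pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up data (not taken from any study): three pathogenic, three benign.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]

# 8 of the 9 pathogenic/benign pairs are ranked correctly, so this prints 8/9.
print(auroc(toy_scores, toy_labels))
```

An auROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported values of 0.905 to 0.987 in context.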
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

