Submitted: 27 February 2024
Posted: 26 March 2024
Abstract
Keywords:
Introduction
Materials and Methods
Variant Selection
In Silico Classification Tool Selection
Parameter Setting
Evaluation of Performance
Characteristics of Selected Variants
Results
Overall Performance
Single-Gene Analysis
Single Algorithm Predictors: Polyphen-HumDiv, CADD, Mutation Taster 2021, Align-GVGD
Meta-Predictor: REVEL
Generative AI: Chat-GPT
Limitations
Conclusion
Data Availability
Financial Support
Conflicts of Interest
References

| | Polyphen-HumDiv | CADD | Mutation Taster 2021 | Align-GVGD | REVEL | Chat-GPT |
|---|---|---|---|---|---|---|
| Sensitivity | 86.25 | 100.00 | 93.75 | 28.75 | 22.50 | 22.50 |
| Specificity | 73.75 | 27.50 | 73.75 | 76.25 | 100.00 | 100.00 |
| Positive Predictive Value | 76.67 | 57.97 | 78.13 | 54.76 | 100.00 | 100.00 |
| Negative Predictive Value | 84.29 | 100.00 | 92.19 | 51.69 | 56.34 | 56.34 |
| MCC | 0.60 | 0.40 | 0.69 | 0.06 | 0.36 | 0.36 |
| Overall accuracy | 80.00 | 63.75 | 83.75 | 52.50 | 61.25 | 61.25 |

| | Polyphen-HumDiv | CADD | REVEL | Chat-GPT | Mutation Taster 2021 | Align-GVGD |
|---|---|---|---|---|---|---|
| TP53 | ||||||
| Sensitivity | 100 | 100 | 70 | 80 | 100 | 0 |
| Specificity | 80 | 60 | 50 | 100 | 80 | 100 |
| Positive Predictive Value | 83.3 | 71.43 | 58.33 | 100 | 83.3 | N/A |
| Negative Predictive Value | 100 | 100 | 62.5 | 83.3 | 100 | 50 |
| MCC | 0.82 | 0.65 | 0.2 | 0.82 | 0.82 | N/A |
| Overall accuracy | 90 | 80 | 60 | 90 | 90 | 50 |
| STK11 | ||||||
| Sensitivity | 100 | 100 | 100 | 0 | 100 | 90 |
| Specificity | 70 | 30 | 80 | 100 | 50 | 50 |
| Positive Predictive Value | 76.92 | 58.82 | 83.33 | N/A | 66.67 | 64.29 |
| Negative Predictive Value | 100 | 100 | 100 | 50 | 100 | 83.33 |
| MCC | 0.73 | 0.42 | 0.82 | N/A | 0.58 | 0.44 |
| Overall accuracy | 85 | 65 | 90 | 50 | 75 | 70 |
| SMARCA4 | ||||||
| Sensitivity | 71.43 | 100 | 71.43 | 0 | 71.43 | 100 |
| Specificity | 42.86 | 14.29 | 71.43 | 100 | 71.43 | 28.57 |
| Positive Predictive Value | 55.56 | 53.85 | 71.43 | N/A | 71.43 | 58.33 |
| Negative Predictive Value | 60 | 100 | 71.43 | 50 | 71.43 | 100 |
| MCC | 0.15 | 0.28 | 0.43 | N/A | 0.43 | 0.41 |
| Overall accuracy | 57.14 | 57.14 | 71.43 | 50 | 71.43 | 64.29 |
| SMAD4 | ||||||
| Sensitivity | 100 | 100 | 100 | 10 | 100 | 30 |
| Specificity | 70 | 0 | 70 | 100 | 60 | 40 |
| Positive Predictive Value | 76.92 | 50 | 76.92 | 100 | 71.43 | 33.33 |
| Negative Predictive Value | 100 | N/A | 100 | 52.63 | 100 | 36.36 |
| MCC | 0.73 | N/A | 0.73 | 0.23 | 0.65 | -0.3 |
| Overall accuracy | 85 | 50 | 85 | 55 | 80 | 35 |
| PTEN | ||||||
| Sensitivity | 100 | 100 | 100 | 0 | 100 | 0 |
| Specificity | 90 | 30 | 100 | 100 | 100 | 100 |
| Positive Predictive Value | 90.91 | 58.82 | 100 | N/A | 100 | N/A |
| Negative Predictive Value | 100 | 100 | 100 | 50 | 100 | 50 |
| MCC | 0.9 | 0.42 | 1 | N/A | 1 | N/A |
| Overall accuracy | 95 | 65 | 100 | 50 | 100 | 50 |
| POLE | ||||||
| Sensitivity | 80 | 100 | 80 | 20 | 80 | 0 |
| Specificity | 70 | 30 | 100 | 100 | 100 | 100 |
| Positive Predictive Value | 72.73 | 58.82 | 100 | 100 | 100 | N/A |
| Negative Predictive Value | 77.78 | 100 | 83.33 | 55.56 | 83.33 | 50 |
| MCC | 0.5 | 0.42 | 0.82 | 0.33 | 0.82 | N/A |
| Overall accuracy | 75 | 65 | 90 | 60 | 90 | 50 |
| EZH2 | ||||||
| Sensitivity | 40 | 100 | 100 | 0 | 100 | 0 |
| Specificity | 90 | 0 | 100 | 100 | 100 | 100 |
| Positive Predictive Value | 80 | 50 | 100 | N/A | 100 | N/A |
| Negative Predictive Value | 60 | N/A | 100 | 50 | 100 | 50 |
| MCC | 0.35 | N/A | 1 | N/A | 1 | N/A |
| Overall accuracy | 65 | 50 | 100 | 50 | 100 | 50 |
| CDKN2A | ||||||
| Sensitivity | 90 | 100 | 90 | 60 | 100 | 10 |
| Specificity | 70 | 60 | 100 | 100 | 30 | 100 |
| Positive Predictive Value | 75 | 71.43 | 100 | 100 | 58.82 | 100 |
| Negative Predictive Value | 87.5 | 100 | 90.91 | 71.43 | 100 | 52.63 |
| MCC | 0.61 | 0.65 | 0.9 | 0.65 | 0.42 | 0.23 |
| Overall accuracy | 80 | 80 | 95 | 80 | 65 | 55 |
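
The six metrics reported in the tables above follow the standard confusion-matrix definitions. As a minimal sketch (not the authors' code), they could be computed as below. The function names `safe_div` and `classification_metrics` and the example counts are hypothetical, chosen only for illustration. A zero denominator returns `None`, which corresponds to the N/A entries that appear when a tool never predicts one of the two classes.

```python
import math
from typing import Dict, Optional


def safe_div(num: float, den: float) -> Optional[float]:
    """Return num / den, or None when the denominator is zero (shown as N/A)."""
    return num / den if den else None


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, Optional[float]]:
    """Compute the six table metrics from confusion-matrix counts.

    tp / fn: pathogenic variants predicted pathogenic / benign
    tn / fp: benign variants predicted benign / pathogenic
    All values are fractions; multiply by 100 for percentages.
    """
    # The MCC denominator is zero whenever a row or column of the matrix is empty.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Illustrative counts only: 7 pathogenic and 7 benign variants.
    print(classification_metrics(tp=5, fp=2, tn=5, fn=2))
    # A tool that calls every variant benign: PPV and MCC are undefined.
    print(classification_metrics(tp=0, fp=0, tn=10, fn=10))
```

In the second call the tool makes no positive predictions, so PPV and MCC have a zero denominator and come back as `None`, while NPV and accuracy are both 0.5.
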

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).