Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

Predicting the Effect of Mutations on Protein Stability and Binding: Assessment of Leading Algorithms Performance and Databases Content with Respect to Types of Mutations

Version 1 : Received: 1 June 2023 / Approved: 2 June 2023 / Online: 2 June 2023 (11:58:21 CEST)

How to cite: Pandey, P.; Pandey, S.; Rimal, P.; Ancona, N.; Alexov, E. Predicting the Effect of Mutations on Protein Stability and Binding: Assessment of Leading Algorithms Performance and Databases Content with Respect to Types of Mutations. Preprints 2023, 2023060199. https://doi.org/10.20944/preprints202306.0199.v1

Abstract

The development of methods and algorithms to predict the effect of mutations on protein stability and on protein-protein and protein-DNA/RNA binding is driven by the needs of protein engineering and by the need to understand the molecular mechanisms of disease-causing variants. The vast majority of the leading methods are either methods with adjustable parameters or machine learning algorithms, and both kinds require a database of experimentally measured folding and binding free energy changes. These databases are collections of experimental data taken from scientific investigations that typically aim to probe the role of a particular residue in the above-mentioned thermodynamic characteristics; that is, the mutations are not introduced at random and do not necessarily represent mutations originating from a single nucleotide variant (SNV). Thus, the reported performance of the leading algorithms, assessed on these databases or on other limited cases, may not carry over to predicting the effect of SNVs seen in the human population. Indeed, we demonstrate that SNVs and non-SNVs are not equally represented in the corresponding databases and that the distributions of their free energy changes are not the same. Furthermore, the Pearson correlation coefficients (PCCs) obtained for cases involving SNVs are lower than those for non-SNVs, indicating that caution should be used when applying these methods to reveal the effect of human SNVs. All methods are found to underestimate the energy changes by roughly a factor of 2.
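
To make the SNV versus non-SNV distinction concrete, the following is a minimal sketch of one way to decide whether an amino acid substitution can arise from a single nucleotide change. It assumes the standard genetic code and considers every codon of the wild-type residue; the function name `is_snv_reachable` and the helper table are illustrative and are not taken from the preprint, whose actual classification procedure is not described in this abstract.

```python
# Illustrative sketch (not the authors' procedure): classify an amino acid
# substitution as SNV-reachable if some codon of the wild-type residue can be
# turned into some codon of the mutant residue by changing one nucleotide.
# Assumption: the standard genetic code (NCBI translation table 1).

BASES = "TCAG"
# Amino acids in the conventional TCAG ordering of the 64 codons; '*' is stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES),
        AMINO_ACIDS,
    )
}


def is_snv_reachable(wild_type: str, mutant: str) -> bool:
    """Return True if `mutant` is one nucleotide change away from `wild_type`.

    Both arguments are one-letter amino acid codes. Because the actual codon
    in a given gene is not known here, all synonymous codons are tried.
    """
    for codon, aa in CODON_TABLE.items():
        if aa != wild_type:
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue
                neighbor = codon[:position] + base + codon[position + 1:]
                if CODON_TABLE[neighbor] == mutant:
                    return True
    return False


if __name__ == "__main__":
    for wt, mt in [("A", "V"), ("M", "K"), ("L", "A"), ("W", "A")]:
        label = "SNV" if is_snv_reachable(wt, mt) else "non-SNV"
        print(f"{wt}->{mt}: {label}")
```

Under this assumption, a substitution such as leucine to alanine is classified as non-SNV because no leucine codon is a single nucleotide away from an alanine codon, whereas alanine to valine is SNV-reachable (GCN to GTN). When the real coding sequence is available, checking only the codon actually present would be stricter than trying all synonymous codons.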

Keywords

mutations; folding free energy change; binding free energy change; single nucleotide variant

Subject

Biology and Life Sciences, Biochemistry and Molecular Biology
