Background: Deep learning has become an important tool for predicting mutation-induced changes in binding free energy (ΔΔG). However, most current state-of-the-art methods rely heavily on paired wild-type (WT) and mutant (MT) complex structures during both training and inference. This dependence on post-mutation structural information substantially limits their practical utility in real-world scenarios, such as clinical diagnosis and early-stage drug screening, where mutant structures are difficult to obtain experimentally in a timely manner.
Methods: To evaluate model performance in more realistic and challenging translational settings, we systematically benchmarked graph-based deep learning models under a WT-only inductive setting, in which no mutant structure is available at training or inference time. We constructed a full-protein heterogeneous graph framework that incorporates long-range spatial constraints to implicitly infer mutational effects from static wild-type structures, and compared it against a sequence-based vector baseline.
Results: A systematic evaluation on the MdrDB dataset revealed a critical generalization gap. Random splitting yielded a relatively high predictive correlation (Pearson R ≈ 0.55), owing to data leakage between homologous entries. Under a strict UniProt-based cross-protein split, designed to simulate prediction on truly unseen targets, performance dropped sharply (Pearson R ≈ 0.15). Absolute performance remained limited, but the graph-based model showed a weak yet consistent improvement over the sequence baseline, which performed close to random guessing (Pearson R ≈ 0.04).
Conclusions: Further analyses suggest that the performance bottleneck may arise partly from intrinsic experimental noise in the dataset (i.e., label inconsistency) and from the absence of dynamic information, such as conformational entropy, in static WT structures. This study indicates that random splitting can lead to substantial overestimation of model generalizability, and it highlights the need to integrate physical priors and dynamic features to overcome current limitations in drug resistance prediction when explicit mutant structures are unavailable.
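
Illustrative sketch: the difference between the random split and the UniProt-based cross-protein split described in the Results can be made concrete with a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the study: the function names (`cross_protein_split`, `random_split`, `shared_proteins`), the greedy filling rule for the test set, and the toy protein labels are all hypothetical, and the abstract does not state how split sizes were balanced.

```python
import random
from collections import defaultdict


def cross_protein_split(uniprot_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no protein ID appears on both sides.

    Hypothetical helper: whole proteins are assigned to the test set until
    it holds roughly `test_fraction` of the samples.
    """
    groups = defaultdict(list)
    for idx, uid in enumerate(uniprot_ids):
        groups[uid].append(idx)

    # Shuffle proteins rather than samples, so each protein stays intact.
    proteins = sorted(groups)
    random.Random(seed).shuffle(proteins)

    target = test_fraction * len(uniprot_ids)
    train_idx, test_idx = [], []
    for uid in proteins:
        if len(test_idx) < target:
            test_idx.extend(groups[uid])
        else:
            train_idx.extend(groups[uid])
    return sorted(train_idx), sorted(test_idx)


def random_split(n_samples, test_fraction=0.2, seed=0):
    """Conventional per-sample split; proteins may appear on both sides."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(test_fraction * n_samples))
    return sorted(indices[n_test:]), sorted(indices[:n_test])


def shared_proteins(uniprot_ids, train_idx, test_idx):
    """Return the protein IDs present in both train and test (the leakage)."""
    train_ids = {uniprot_ids[i] for i in train_idx}
    test_ids = {uniprot_ids[i] for i in test_idx}
    return train_ids & test_ids


# Toy data: placeholder labels, not real UniProt accessions.
ids = ["protA"] * 4 + ["protB"] * 3 + ["protC"] * 3
train, test = cross_protein_split(ids, test_fraction=0.3, seed=0)
leak = shared_proteins(ids, train, test)  # empty set by construction
```

Calling `shared_proteins` on the output of `random_split` for the same data will typically return a non-empty set, which is the kind of overlap the abstract identifies as the source of inflated correlation under random splitting.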