Preprint
Review

This version is not peer-reviewed.

Machine Learning and Deep Learning Frameworks for Human–Virus Protein–Protein Interaction Prediction: Emerging Architectures, Methods, Benchmarks, and Challenges

Submitted: 29 May 2026

Posted: 29 May 2026


Abstract
The outbreak of coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is one of the most significant global health crises in recent history. Coronaviruses are a diverse group of RNA viruses classified into four genera (alpha, beta, gamma, and delta), and SARS-CoV-2 belongs to the betacoronavirus genus. The virus is highly transmissible and causes a wide spectrum of clinical manifestations, ranging from mild respiratory symptoms to severe complications such as acute respiratory distress syndrome, multi-organ failure, and death, particularly among elderly and immunocompromised individuals. SARS-CoV-2 has a large single-stranded RNA genome that encodes the major structural proteins, including the spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins, which play critical roles in host cell recognition and viral infection. Understanding the molecular mechanisms of virus–host interactions, especially protein–protein interactions (PPIs), is essential for uncovering viral pathogenesis and identifying potential therapeutic targets. Traditional experimental techniques for PPI detection, such as yeast two-hybrid screening and affinity purification, are often expensive, labor-intensive, and prone to false-positive and false-negative results. Consequently, computational approaches based on machine learning and deep learning have gained significant attention as efficient and scalable alternatives for PPI prediction. These methods draw on diverse biological information, including protein sequences, structural features, genomic data, gene ontology annotations, and interaction networks, to model complex biological relationships. This survey provides a comprehensive review of computational approaches for PPI prediction, covering both machine learning- and deep learning-based techniques, along with their methodological advances and performance evaluations. It also discusses the major biological databases and data sources commonly used in PPI studies, and offers insights into current challenges and future directions in computational PPI prediction research.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.


© 2026 MDPI (Basel, Switzerland) unless otherwise stated