Submitted: 28 May 2025
Posted: 28 May 2025
Abstract
Keywords:
1. Introduction
- Minimization of external connectivity (CH paradigm). We formalize a new principle within the Cannistraci-Hebb (CH) framework that emphasizes minimizing external local-community links (eLCL), leading to the introduction of two new models, CH3 and CH3.1.
- Adaptive model selection. We design an adaptive mechanism that automatically selects the most suitable CH model and path length for each network, based on internal validation performance. This removes the need for manual tuning and ensures better alignment with the network’s structural features. Empirically, on a benchmark of over 1000 networks, our adaptive method achieves more than twice the win rate of the best-performing baseline.
- Comprehensive static and temporal benchmark. We construct a large-scale benchmark, ATLAS, consisting of 1269 real-world networks (ATLAS-static) and 14 time-evolving networks (ATLAS-temporal).
- Multi-metric evaluation. We adopt three complementary evaluation metrics, Precision, NDCG, and AUPR, to capture diverse aspects of link prediction performance. Across all three metrics, our adaptive model consistently outperforms all baselines, demonstrating its robustness and general superiority under different evaluation criteria.

2. Preliminaries and Methods
2.1. Science of Physical Modelling
2.1.1. Network Automata
2.1.2. Network Automata on Paths of Length n
2.1.3. Cannistraci-Hebb Network Automata on Paths of Length n
2.1.4. CH Model Sub-Ranking Strategy
- Assign to each link in the network a weight to transform similarity into dissimilarity.
- Compute the shortest paths (SP) between all node pairs in the resulting weighted network.
- For each node pair $(i, j)$, compute the prediction score SPcorr as the Spearman's rank correlation between the two vectors of shortest-path lengths from node $i$ and from node $j$ to every other node in the network.
- Generate a final ranking of node pairs such that pairs are first ranked by their CH score, and any ties are sub-ranked using SPcorr. If both scores are tied, the node pairs receive the same final rank.
- (Optional) Map the final ranking back to a likelihood score if a numerical prediction score is required by downstream applications (see details in Appendix G).
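The sub-ranking steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the similarity-to-dissimilarity weight is not given in this text, so `1 / (1 + s)` is used as a hypothetical stand-in; the function names (`shortest_paths`, `spearman`, `subrank`) are illustrative; the Spearman vectors are the full distance rows, including the entries for the two nodes themselves; and pairs tied on both scores share a rank under competition ranking (1, 1, 3).

```python
import math


def shortest_paths(n, weights):
    """Floyd-Warshall on an undirected graph; weights maps (u, v) -> cost."""
    d = [[0.0 if a == b else math.inf for b in range(n)] for a in range(n)]
    for (u, v), w in weights.items():
        d[u][v] = d[v][u] = min(d[u][v], w)
    for k in range(n):
        for a in range(n):
            for b in range(n):
                if d[a][k] + d[k][b] < d[a][b]:
                    d[a][b] = d[a][k] + d[k][b]
    return d


def average_ranks(x):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(x)), key=lambda t: x[t])
    ranks = [0.0] * len(x)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and x[order[end + 1]] == x[order[start]]:
            end += 1
        for t in order[start:end + 1]:
            ranks[t] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def subrank(n, similarity, ch_score):
    """Rank candidate pairs by CH score, breaking ties with SPcorr."""
    # ASSUMPTION: 1 / (1 + s) stands in for the paper's weight formula.
    weights = {e: 1.0 / (1.0 + s) for e, s in similarity.items()}
    d = shortest_paths(n, weights)
    rows = [((i, j), ch_score[(i, j)], spearman(d[i], d[j])) for i, j in ch_score]
    rows.sort(key=lambda r: (-r[1], -r[2]))
    ranking, rank, prev = {}, 0, None
    for pos, (pair, ch, sp) in enumerate(rows, start=1):
        if (ch, sp) != prev:  # a new (CH, SPcorr) level starts a new rank
            rank, prev = pos, (ch, sp)
        ranking[pair] = rank
    return ranking
```

Floyd-Warshall keeps the sketch short but costs cubic time in the number of nodes; a sparse implementation would more likely run Dijkstra from each node.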
2.2. Engineering the Adaptive Network Automata Machine
3. Experiments
3.1. Datasets and Baselines
3.1.1. Datasets
- ATLAS-static includes 1269 undirected static networks from 14 domains such as biological, social, and economic systems (see Appendix D.1 for full details).
- ATLAS-temporal consists of 14 real-world networks with temporal snapshots representing dynamic evolution across time (see Appendix D.2 for details).
3.1.2. Baselines
3.1.3. Scale of Evaluation
3.2. Link Prediction on ATLAS-Static
3.3. Temporal Link Prediction
3.4. Path Length Preference Across Network Classes
3.5. Validation of the CH Adaptive Strategy
4. Conclusions and Discussion
Acknowledgments
Appendix A

| Method | Win rate | AUPR |
|---|---|---|
| upper bound | 1.00 | 0.30 |
| CH3-CH3.1 | 0.72 | 0.29 |
| CH2-CH3-CH3.1 | 0.72 | 0.29 |
| RA-CH2-CH3-CH3.1 | 0.70 | 0.29 |
| RA-CH3-CH3.1 | 0.69 | 0.29 |
| CH2-CH3.1 | 0.67 | 0.28 |
| CH3.1 | 0.65 | 0.28 |
| RA-CH2-CH3.1 | 0.62 | 0.28 |
| RA-CH3.1 | 0.60 | 0.28 |
| CH1-CH3-CH3.1 | 0.56 | 0.27 |
| CH1-CH2-CH3-CH3.1 | 0.56 | 0.27 |
| CH1-CH2-CH3.1 | 0.54 | 0.26 |
| RA-CH1-CH3-CH3.1 | 0.53 | 0.27 |
| RA-CH1-CH2-CH3-CH3.1 | 0.53 | 0.27 |
| CH1-CH3.1 | 0.52 | 0.26 |
| RA-CH1-CH2-CH3.1 | 0.48 | 0.26 |
| RA-CH1-CH3.1 | 0.47 | 0.26 |
| CH3 | 0.41 | 0.28 |
| CH2-CH3 | 0.41 | 0.28 |
| RA-CH3 | 0.38 | 0.28 |
| RA-CH2-CH3 | 0.38 | 0.28 |
| CH2 | 0.26 | 0.27 |
| RA-CH2 | 0.26 | 0.27 |
| CH1-CH2-CH3 | 0.26 | 0.26 |
| RA | 0.25 | 0.26 |
| CH1-CH3 | 0.25 | 0.26 |
| RA-CH1-CH2-CH3 | 0.23 | 0.25 |
| RA-CH1-CH3 | 0.22 | 0.25 |
| CH1-CH2 | 0.16 | 0.25 |
| RA-CH1-CH2 | 0.12 | 0.25 |
| RA-CH1 | 0.12 | 0.24 |
| CH1 | 0.10 | 0.23 |
Appendix B. Link Prediction Methods
Appendix B.1. Structural Perturbation Method (SPM)
- Randomly remove 10% of the links from the network adjacency matrix X, obtaining a reduced network, where R denotes the set of removed links.
- Compute the eigenvalues and eigenvectors of the reduced network.
- Treating the set of links R as a perturbation of the reduced network, construct the perturbed matrix via a first-order approximation that allows the eigenvalues to change while keeping the eigenvectors fixed.
- Repeat the three steps above for 10 independent iterations and take the average of the resulting perturbed matrices.
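The SPM procedure above can be sketched with NumPy as follows. This is a hedged illustration rather than the reference implementation: it assumes a binary, symmetric adjacency matrix, uses the standard first-order eigenvalue shift $\Delta\lambda_k \approx x_k^{T} \Delta X\, x_k$ for orthonormal eigenvectors $x_k$, and does not treat degenerate eigenvalues specially. The function name `spm_scores` and its parameters are hypothetical.

```python
import numpy as np


def spm_scores(X, frac=0.1, n_iter=10, seed=0):
    """Average perturbed matrix over n_iter random removals of a fraction of links."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    iu, ju = np.triu_indices_from(X, k=1)      # upper-triangle node pairs
    links = np.flatnonzero(X[iu, ju] > 0)      # positions of observed links
    n_remove = max(1, int(round(frac * links.size)))
    total = np.zeros_like(X)
    for _ in range(n_iter):
        picked = rng.choice(links, size=n_remove, replace=False)
        dX = np.zeros_like(X)                  # perturbation: the removed links R
        dX[iu[picked], ju[picked]] = X[iu[picked], ju[picked]]
        dX[ju[picked], iu[picked]] = X[iu[picked], ju[picked]]
        XR = X - dX                            # reduced network
        lam, V = np.linalg.eigh(XR)            # eigenpairs of the reduced network
        # First-order eigenvalue shift with the eigenvectors held fixed.
        dlam = np.einsum("ik,ij,jk->k", V, dX, V)
        total += (V * (lam + dlam)) @ V.T      # sum_k (lam_k + dlam_k) x_k x_k^T
    return total / n_iter
```

Entries of the returned matrix at non-observed pairs would then serve as link likelihood scores.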
Appendix B.2. Stochastic Block Model (SBM)
Appendix B.3. HOPE
Appendix B.4. node2vec
Appendix B.5. ProNE and ProNE-SMF
Appendix B.6. NetSMF
Appendix B.7. Logistic Regression Classifier
- Create a learning set consisting of all the observed links and an equal number of non-observed links (if available; otherwise, include all non-observed links).
- Split the learning set into 5 folds for cross-validation.
- For each cross-validation iteration:
  - (a) Train: Train a logistic regression classifier using 4 folds and obtain the coefficient estimates.
  - (b) Validation: Using these coefficients, obtain the likelihood scores for the remaining fold and compute the prediction performance using AUPR$_{i,j}$.
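The construction of the learning set and the 5-fold split described above can be sketched as follows. The classifier itself is left out; any logistic regression implementation would be trained on `train` and scored on `val`. The function names are hypothetical, and assigning shuffled samples to folds round-robin is an assumption, since the text does not say how folds are formed.

```python
import random
from itertools import combinations


def build_learning_set(n, links, seed=0):
    """All observed links (label 1) plus an equal number of non-observed pairs (label 0).

    If fewer non-observed pairs exist than observed links, all of them are used.
    """
    rng = random.Random(seed)
    observed = {tuple(sorted(e)) for e in links}
    non_observed = [p for p in combinations(range(n), 2) if p not in observed]
    k = min(len(observed), len(non_observed))
    negatives = rng.sample(non_observed, k)
    samples = [(p, 1) for p in sorted(observed)] + [(p, 0) for p in negatives]
    rng.shuffle(samples)
    return samples


def cv_splits(samples, n_folds=5):
    """Yield (train, val) pairs; each fold serves as validation exactly once."""
    folds = [samples[f::n_folds] for f in range(n_folds)]  # round-robin assignment
    for f in range(n_folds):
        train = [s for g in range(n_folds) if g != f for s in folds[g]]
        yield train, folds[f]
```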
Appendix B.8. MPLP and MPLP+
Appendix C. Link Prediction Evaluation
Appendix C.1. 10% Link Removal Evaluation
Appendix C.2. Temporal Evaluation
Appendix D. Datasets
Appendix D.1. ATLAS
| Class | Count |
|---|---|
| Collaboration | 18 |
| Contact | 32 |
| Covert | 86 |
| Friendship | 16 |
| PPI | 14 |
| Connectome | 529 |
| Foodweb | 71 |
| Trade | 200 |
| Transcription | 8 |
| Coauthorship | 20 |
| Flightmap | 36 |
| Internet | 162 |
| Socialnetwork | 68 |
| Software | 9 |
| Total | 1269 |
Appendix D.2. Temporal Networks
Appendix E. Compute Resources
Appendix F. Time Complexity and Runtime Analysis
Appendix F.1. Time Complexity of CHA
Appendix F.1.1. For ℓ=2
- Path count. Each length-2 path is defined by an intermediate node $z$ connected to both $u$ and $v$. The total number of such paths is given by $\sum_{z} \binom{k_z}{2} = \sum_{z} \frac{k_z (k_z - 1)}{2}$, where $k_z$ is the degree of node $z$. This represents the number of unique unordered two-hop paths in the network.
- Computation per path. For each length-2 path, CHA computes a score based on the iLCL and eLCL of the intermediate node $z$. This requires checking the neighbors of $z$ against the local community associated with the pair $(u, v)$, which takes $O(k_z)$ time per path.
- Overall time complexity. Multiplying the path count and per-path cost gives the total time complexity: $O\left(\sum_{z} k_z^{3}\right)$.
- Sparse, degree-homogeneous: If the graph is sparse (i.e., $\langle k \rangle = O(1)$) with relatively uniform degrees (i.e., $k_z \approx \langle k \rangle$ for all $z$), then $\sum_{z} k_z^{3} \approx N \langle k \rangle^{3}$, so the overall time complexity is $O(N)$.
-
Sparse, degree-heterogeneous: If the graph is sparse (i.e., ), but has a skewed degree distribution (e.g., power law), we can no longer assume for all nodes. To handle this case, we apply a relaxation via Hölder’s inequality [43] to upper-bound the root-mean-cube degree in terms of the average degree:This relaxation allows us to express the cubic-degree term in the overall complexity as:Thus, the overall time complexity in this case is .
- Dense graphs: In the worst-case scenario of dense graphs, where $k_z = O(N)$ for all nodes, we obtain $\sum_{z} k_z^{3} = O(N \cdot N^{3})$, leading to an overall time complexity of $O(N^{4})$.
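The length-2 path count and the cubic-degree term discussed above can be checked numerically on a small graph. The sketch assumes the graph is stored as a dictionary mapping each node to its neighbor set; the function names are illustrative. It counts unordered two-hop paths as one binomial term $\binom{k_z}{2}$ per intermediate node and compares that against explicit enumeration.

```python
from itertools import combinations


def l2_path_count(adj):
    """Number of unordered length-2 paths: sum over nodes z of C(k_z, 2)."""
    return sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())


def enumerate_l2_paths(adj):
    """Explicit list of (u, z, v) paths, used only to cross-check the formula."""
    return [(u, z, v) for z in adj for u, v in combinations(sorted(adj[z]), 2)]


def cubic_degree_sum(adj):
    """Sum of k_z^3, the term that drives the length-2 cost estimate."""
    return sum(len(nb) ** 3 for nb in adj.values())


# Star graph: one hub with four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
assert l2_path_count(star) == len(enumerate_l2_paths(star))
```

On the star graph all two-hop paths pass through the hub, which is why a single high-degree node dominates the cubic sum in heterogeneous networks.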
Appendix F.1.2. For ℓ=3
- Path count. Each length-3 path passes through a central edge $(i, j)$. The number of such paths using $(i, j)$ as the central segment is $(k_i - 1)(k_j - 1)$, where $k_i$ and $k_j$ are the degrees of $i$ and $j$, respectively. The total number of such paths is $\sum_{(i,j) \in E} (k_i - 1)(k_j - 1)$.
-
Computation per path. For each length-3 path , CHA computes the iLCL and eLCL of intermediate nodes i and j with respect to the seed pair .Each such computation, i.e., evaluating the iLCL/eLCL of node i with respect to , requires scanning the neighborhood of i and takes time. However, this computation is performed only once for each triplet , and the result is reused across all paths in which appears.Since each such triplet is associated with paths on average, the total cost is distributed across multiple paths. Thus, the amortized cost per path remains .
- Overall time complexity. For compact notation, we define the RMS degree–degree product over edges:and upper bound the total complexity as:
-
Sparse, degree-homogeneous: If the graph is sparse (i.e., ) with relatively uniform degrees (i.e., for all nodes), then for all edges and . This yields:So the overall time complexity of .
-
Sparse, degree-heterogeneous: If the graph is sparse (i.e., ), but has a skewed degree distribution (e.g., power law), we upper bound:In the worst case, this maximum can scale as . Since the total number of edges is , it follows that . This leads to an overall complexity:
- Dense networks: If the network is dense ( and degrees are ), then and:
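For the length-3 case, the per-edge path count can also be checked on toy graphs. The sketch below is an illustration under assumptions: it takes the number of paths through a central edge $(i, j)$ to be at most $(k_i - 1)(k_j - 1)$, and compares this bound with an explicit count of simple paths $u$, $i$, $j$, $v$ with $u \neq v$. The two differ exactly when the endpoints of the central edge share neighbors, that is, when triangles are present. Function names are hypothetical.

```python
def undirected_edges(adj):
    """Each undirected edge once, as an ordered pair (min, max)."""
    return {(min(i, j), max(i, j)) for i in adj for j in adj[i]}


def l3_bound(adj):
    """Sum over edges (i, j) of (k_i - 1) * (k_j - 1)."""
    return sum((len(adj[i]) - 1) * (len(adj[j]) - 1) for i, j in undirected_edges(adj))


def l3_simple_paths(adj):
    """Explicit count of simple length-3 paths u-i-j-v (u != v)."""
    count = 0
    for i, j in undirected_edges(adj):
        for u in adj[i] - {j}:
            for v in adj[j] - {i}:
                if u != v:
                    count += 1
    return count


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
```

On the path and the 4-cycle the bound is tight; on the triangle it overcounts, because every candidate path would revisit a node.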
Appendix F.1.3. Summary
- Sparse, degree-homogeneous: When the average degree is and degree distribution is uniform, the complexity is:
- Sparse, degree-heterogeneous: When the average degree is but degree distribution is skewed (e.g., power-law), the complexity is higher due to hubs:
- Dense networks: When the average degree is , the worst-case complexity becomes
Appendix F.1.4. Subranking Complexity
Appendix F.2. Running Time of CHA


Appendix G. Mapping Subranking to Likelihood Score
- Score-guided interpolation. Tied scores are adjusted based on the actual SPcorr values, preserving their relative magnitudes within the group. This results in a smooth, value-aware distribution of scores.
- Rank-based interpolation. Tied scores are redistributed uniformly according to their sub-rank positions, regardless of the SPcorr values. This maintains only the order but not the magnitude.
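The difference between the two interpolation strategies can be illustrated with a small sketch. The exact mapping is not given in this text, so everything below is an assumption made for illustration: each group of pairs tied at a CH score `s` is spread downwards within the gap to the next lower distinct score, the lowest group uses a gap of 1.0, and the function name `spread_ties` and its `mode` values are hypothetical.

```python
def spread_ties(ch, spcorr, mode="rank"):
    """Return adjusted scores; ch and spcorr are dicts keyed by node pair.

    mode="rank":  tied pairs are spaced uniformly by sub-rank position.
    mode="score": tied pairs are spaced in proportion to their SPcorr values.
    """
    levels = sorted(set(ch.values()), reverse=True)
    out = {}
    for idx, s in enumerate(levels):
        # ASSUMPTION: the lowest level has no lower neighbour, so use a gap of 1.0.
        gap = s - levels[idx + 1] if idx + 1 < len(levels) else 1.0
        group = [p for p in ch if ch[p] == s]
        distinct = sorted({spcorr[p] for p in group}, reverse=True)
        d = len(distinct)
        hi, lo = distinct[0], distinct[-1]
        for p in group:
            if d == 1:
                frac = 0.0                      # nothing to separate
            elif mode == "rank":
                frac = distinct.index(spcorr[p]) / d
            else:
                # Scale by (d - 1) / d so the result stays above the next level.
                frac = (hi - spcorr[p]) / (hi - lo) * (d - 1) / d
            out[p] = s - gap * frac
    return out
```

Both modes preserve the order induced by the sub-ranking and keep every adjusted score strictly above the next lower CH level; only the score-guided mode also reflects how far apart the SPcorr values are.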
References
1. Lü, L.; Zhou, T. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 2011, 390, 1150–1170.
2. Liben-Nowell, D.; Kleinberg, J. The link prediction problem for social networks. In Proceedings of the Twelfth International Conference on Information and Knowledge Management, 2003, pp. 556–559.
3. Cannistraci, C.V.; Alanis-Lobato, G.; Ravasi, T. From link-prediction in brain connectomes and protein interactomes to the local-community-paradigm in complex networks. Scientific Reports 2013, 3, 1613.
4. Lü, L.; Pan, L.; Zhou, T.; Zhang, Y.C.; Stanley, H.E. Toward link predictability of complex networks. Proceedings of the National Academy of Sciences 2015, 112, 2325–2330.
5. Peixoto, T.P. Hierarchical block structures and high-resolution model selection in large networks. Physical Review X 2014, 4, 011047.
6. Ou, M.; Cui, P.; Pei, J.; Zhang, Z.; Zhu, W. Asymmetric transitivity preserving graph embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1105–1114.
7. Grover, A.; Leskovec, J. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 855–864.
8. Qiu, J.; Dong, Y.; Ma, H.; Li, J.; Wang, C.; Wang, K.; Tang, J. NetSMF: Large-scale network embedding as sparse matrix factorization. In Proceedings of The World Wide Web Conference, 2019, pp. 1509–1520.
9. Zhang, J.; Dong, Y.; Wang, Y.; Tang, J.; Ding, M. ProNE: Fast and scalable network representation learning. In Proceedings of IJCAI, 2019, Vol. 19, pp. 4278–4284.
10. Dong, K.; Guo, Z.; Chawla, N. Pure message passing can estimate common neighbor for link prediction. Advances in Neural Information Processing Systems 2024, 37, 73000–73035.
11. Daminelli, S.; Thomas, J.M.; Durán, C.; Cannistraci, C.V. Common neighbours and the local-community-paradigm for topological link prediction in bipartite networks. New Journal of Physics 2015, 17, 113037.
12. Durán, C.; Daminelli, S.; Thomas, J.M.; Haupt, V.J.; Schroeder, M.; Cannistraci, C.V. Pioneering topological methods for network-based drug–target prediction by exploiting a brain-network self-organization theory. Briefings in Bioinformatics 2018, 19, 1183–1202.
13. Cannistraci, C.V. Modelling self-organization in complex networks via a brain-inspired network automata theory improves link reliability in protein interactomes. Scientific Reports 2018, 8, 15760.
14. Muscoloni, A.; Michieli, U.; Cannistraci, C.V. Local-ring network automata and the impact of hyperbolic geometry in complex network link-prediction. arXiv preprint arXiv:1707.09496, 2017.
15. Muscoloni, A.; Abdelhamid, I.; Cannistraci, C.V. Local-community network automata modelling based on length-three-paths for prediction of complex network structures in protein interactomes, food webs and more. bioRxiv 2018, 346916.
16. Zhou, T.; Lee, Y.L.; Wang, G. Experimental analyses on 2-hop-based and 3-hop-based link prediction algorithms. Physica A: Statistical Mechanics and its Applications 2021, 564, 125532.
17. Newman, M.E. Clustering and preferential attachment in growing networks. Physical Review E 2001, 64, 025102.
18. Zhou, T.; Lü, L.; Zhang, Y.C. Predicting missing links via local information. The European Physical Journal B 2009, 71, 623–630.
19. Jaccard, P. Distribution comparée de la flore alpine dans quelques régions des Alpes occidentales et orientales. Bulletin de la Murithienne 1902, pp. 81–92.
20. Kovács, I.A.; Luck, K.; Spirohn, K.; Wang, Y.; Pollis, C.; Schlabach, S.; Bian, W.; Kim, D.K.; Kishore, N.; Hao, T.; et al. Network-based prediction of protein interactions. Nature Communications 2019, 10, 1240.
21. Barabási, A.L.; Albert, R. Emergence of scaling in random networks. Science 1999, 286, 509–512.
22. Papadopoulos, F.; Kitsak, M.; Serrano, M.Á.; Boguñá, M.; Krioukov, D. Popularity versus similarity in growing networks. Nature 2012, 489, 537–540.
23. Muscoloni, A.; Cannistraci, C.V. A nonuniform popularity-similarity optimization (nPSO) model to efficiently generate realistic complex networks with communities. New Journal of Physics 2018, 20, 052002.
24. Wolfram, S.; Gad-el-Hak, M. A new kind of science. Appl. Mech. Rev. 2003, 56, B18–B19.
25. Smith, D.M.; Onnela, J.P.; Lee, C.F.; Fricker, M.D.; Johnson, N.F. Network automata: Coupling structure and function in dynamic networks. Advances in Complex Systems 2011, 14, 317–339.
26. Marr, C.; Hütt, M.T. Topology regulates pattern formation capacity of binary cellular automata on graphs. Physica A: Statistical Mechanics and its Applications 2005, 354, 641–662.
27. Hebb, D. The Organization of Behavior; New York, 1949.
28. Liu, Z.; He, J.L.; Kapoor, K.; Srivastava, J. Correlations between community structure and link formation in complex networks. PLoS ONE 2013, 8, e72908.
29. Pan, L.; Zhou, T.; Lü, L.; Hu, C.K. Predicting missing links and identifying spurious links via likelihood analysis. Scientific Reports 2016, 6, 22955.
30. Tan, F.; Xia, Y.; Zhu, B. Link prediction in complex networks: a mutual information perspective. PLoS ONE 2014, 9, e107056.
31. Wang, W.; Cai, F.; Jiao, P.; Pan, L. A perturbation-based framework for link prediction via non-negative matrix factorization. Scientific Reports 2016, 6, 38938.
32. Wang, T.; Wang, H.; Wang, X. CD-Based indices for link prediction in complex network. PLoS ONE 2016, 11, e0146727.
33. Pech, R.; Hao, D.; Pan, L.; Cheng, H.; Zhou, T. Link prediction via matrix completion. Europhysics Letters 2017, 117, 38002.
34. Shakibian, H.; Moghadam Charkari, N. Mutual information model for link prediction in heterogeneous complex networks. Scientific Reports 2017, 7, 44981.
35. Narula, V.; Zippo, A.G.; Muscoloni, A.; Biella, G.E.M.; Cannistraci, C.V. Can local-community-paradigm and epitopological learning enhance our understanding of how local brain connectivity is able to process, learn and memorize chronic pain? Applied Network Science 2017, 2, 1–28.
36. Rees, C.L.; Moradi, K.; Ascoli, G.A. Weighing the evidence in Peters' rule: does neuronal morphology predict connectivity? Trends in Neurosciences 2017, 40, 63–71.
37. Peixoto, T.P. Efficient Monte Carlo and greedy heuristic for the inference of stochastic block models. Physical Review E 2014, 89, 012804.
38. Zhang, X.; Wang, X.; Zhao, C.; Yi, D.; Xie, Z. Degree-corrected stochastic block models and reliability in networks. Physica A: Statistical Mechanics and its Applications 2014, 393, 553–559.
39. Karrer, B.; Newman, M.E. Stochastic blockmodels and community structure in networks. Physical Review E 2011, 83, 016107.
40. Peixoto, T.P. The Graph-tool Python Library. https://doi.org/10.6084/m9.figshare.1164194, 2014.
41. Vallès-Català, T.; Peixoto, T.P.; Sales-Pardo, M.; Guimerà, R. Consistencies and inconsistencies between model selection and link prediction in networks. Physical Review E 2018, 97, 062316.
42. Katz, L. A new status index derived from sociometric analysis. Psychometrika 1953, 18, 39–43.
43. Stein, E.M.; Shakarchi, R. Real Analysis: Measure Theory, Integration, and Hilbert Spaces; Princeton University Press, 2005.

| Algorithm | Field | Year | Networks | Ref. |
|---|---|---|---|---|
| SBM | Statistical Physics | 2014 | 8 | [37] |
| SBM-DC | Statistical Physics | 2014 | 5 | [38] |
| SBM-N, SBM-DC-N | Statistical Physics | 2014 | 33 | [5] |
| SPM | Quantum Physics | 2015 | 13 | [4] |
| HOPE | Computer Science | 2016 | 4 | [6] |
| node2vec | Computer Science | 2016 | 3 | [7] |
| ProNE, ProNE-SMF | Computer Science | 2019 | 5 | [9] |
| NetSMF | Computer Science | 2019 | 5 | [8] |
| MPLP, MPLP+ | Computer Science | 2024 | 15 | [10] |
| CHA | Physics & CS | 2025 | 1283 | Ours |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
