Submitted: 30 September 2024
Posted: 03 October 2024
Abstract
Keywords:
1. Introduction
2. Related Work
2.1. Simulated Annealing
2.2. Tabu Search
2.3. Variable Neighborhood Search
2.4. Graph Based Machine Learning
3. SA for Graph Networks
3.1. Hyperparameter Optimization
- Exploration and Exploitation: SA starts with a high "temperature," allowing it to explore a wide range of hyperparameter combinations by accepting both better and worse configurations with certain probabilities. As the temperature decreases, the algorithm gradually shifts to more exploitation, focusing on refining the best combinations found.
- Avoiding Local Optima: Hyperparameter optimization landscapes can have many local optima, and GNNs or GCNs are susceptible to suboptimal configurations. SA’s probabilistic acceptance of worse solutions early in the process helps in escaping local optima, leading to potentially better-performing hyperparameters. By efficiently optimizing hyperparameters, SA can improve the training performance of GNNs/GCNs, enabling the model to generalize better and avoid overfitting.
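To make this concrete, the following minimal sketch shows how SA could drive GNN hyperparameter search. It is a sketch under stated assumptions: the `evaluate` function is a hypothetical stand-in for training a GNN/GCN and returning a validation score, and the hyperparameter ranges are illustrative, not recommendations.

```python
import math
import random

def evaluate(config):
    """Hypothetical stand-in for training a GNN/GCN with `config` and
    returning validation accuracy (higher is better). Toy surrogate only."""
    return (1.0
            - abs(math.log10(config["lr"]) + 2.0) * 0.1
            - abs(config["hidden"] - 64) / 256.0
            - abs(config["dropout"] - 0.5) * 0.2)

def random_neighbor(config):
    """Perturb one hyperparameter to produce a neighboring configuration."""
    neighbor = dict(config)
    key = random.choice(list(neighbor))
    if key == "lr":
        neighbor["lr"] = min(0.1, max(1e-4, neighbor["lr"] * random.choice([0.5, 2.0])))
    elif key == "hidden":
        neighbor["hidden"] = random.choice([16, 32, 64, 128, 256])
    else:  # dropout
        neighbor["dropout"] = min(0.8, max(0.0, neighbor["dropout"] + random.uniform(-0.1, 0.1)))
    return neighbor

def simulated_annealing(initial, t_start=1.0, t_end=1e-3, alpha=0.95, steps_per_t=10):
    current, current_score = initial, evaluate(initial)
    best, best_score = current, current_score
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            candidate = random_neighbor(current)
            candidate_score = evaluate(candidate)
            delta = candidate_score - current_score
            # Accept better configurations always; worse ones with probability exp(delta / t).
            if delta > 0 or random.random() < math.exp(delta / t):
                current, current_score = candidate, candidate_score
                if current_score > best_score:
                    best, best_score = current, current_score
        t *= alpha  # geometric cooling: gradual shift from exploration to exploitation
    return best, best_score

best_cfg, best_val = simulated_annealing({"lr": 0.05, "hidden": 32, "dropout": 0.3})
print(best_cfg, round(best_val, 4))
```

The cooling rate (`alpha`) controls how quickly the search shifts from exploration to exploitation; slower cooling explores more configurations at the cost of more training runs.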
3.2. Architecture Search
- The number of convolutional layers,
- The type of aggregation functions (e.g., mean, sum, max),
- The activation functions,
- The type of skip connections or residual structures, and
- How to model multi-hop neighborhood information.
- Modifying Layer Compositions: SA can iteratively modify the number and types of layers in the GNN or GCN. For instance, early in the annealing process, SA might explore architectures with many layers, various activation functions, or complex aggregation schemes. Over time, as temperature decreases, it converges to more refined architectures that are more effective at capturing graph representations.
- Balancing Model Complexity and Performance: Because deeper or more complex GNN architectures are not always better, SA helps balance the trade-off between overfitting (when the model is too complex) and underfitting (when the model is too simple). The probabilistic nature of SA ensures that the architecture search does not get stuck in suboptimal configurations, especially in the early stages of exploration.
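A hedged sketch of how such an architecture search might be organized follows. The search space and the scoring proxy are illustrative assumptions; in practice `evaluate` would train and validate each candidate architecture.

```python
import math
import random

# Illustrative GNN architecture search space (assumed, not from any experiment).
SPACE = {
    "num_layers": [1, 2, 3, 4, 5, 6],
    "aggregation": ["mean", "sum", "max"],
    "activation": ["relu", "elu", "tanh"],
    "skip_connections": [True, False],
}

def evaluate(arch):
    """Hypothetical proxy for validation accuracy of the trained architecture;
    penalizes very deep stacks to mimic over-smoothing/overfitting."""
    score = 0.7 + 0.05 * min(arch["num_layers"], 3) - 0.04 * max(arch["num_layers"] - 3, 0)
    score += 0.02 if arch["skip_connections"] else 0.0
    score += {"mean": 0.01, "sum": 0.0, "max": 0.005}[arch["aggregation"]]
    return score

def neighbor(arch):
    """Change one architectural choice at random."""
    new = dict(arch)
    key = random.choice(list(SPACE))
    new[key] = random.choice([v for v in SPACE[key] if v != arch[key]])
    return new

def anneal(steps=500, t_start=0.5, t_end=1e-3):
    arch = {k: random.choice(v) for k, v in SPACE.items()}
    score = evaluate(arch)
    best, best_score = arch, score
    for i in range(steps):
        t = t_start * (t_end / t_start) ** (i / steps)  # exponential cooling schedule
        cand = neighbor(arch)
        cand_score = evaluate(cand)
        if cand_score > score or random.random() < math.exp((cand_score - score) / t):
            arch, score = cand, cand_score
            if score > best_score:
                best, best_score = arch, score
    return best, best_score

print(anneal())
```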
3.3. Node Feature Selection
- Searching for Optimal Feature Subsets: SA can iteratively explore subsets of node features, allowing for the possibility of accepting less-optimal sets early on. Over time, it converges to a more refined subset of features that yield better embeddings and improved model performance.
- Regularizing Over-Complex Feature Sets: SA can prevent over-reliance on noisy or redundant features by probabilistically eliminating them during the annealing process. This can result in more interpretable and robust GNN models, especially in domains like biology or chemistry where interpretability of node features (e.g., genes or molecular properties) is crucial.
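As a sketch, feature subsets can be represented as sets and perturbed one feature at a time. The `evaluate` proxy and the set of "informative" features below are purely illustrative assumptions standing in for real GNN training on the selected features.

```python
import math
import random

NUM_FEATURES = 20
# Assumed ground truth for the toy example: only these features are informative.
INFORMATIVE = {1, 4, 7, 11, 15}

def evaluate(subset):
    """Stand-in for training a GNN on the selected node features and measuring
    validation performance; rewards informative features and lightly penalizes
    extra features as a proxy for noise/redundancy."""
    hits = len(subset & INFORMATIVE)
    return hits / len(INFORMATIVE) - 0.02 * len(subset - INFORMATIVE)

def flip_one(subset):
    """Neighboring subset: add or remove a single feature."""
    new = set(subset)
    new.symmetric_difference_update({random.randrange(NUM_FEATURES)})
    return new if new else subset  # never allow an empty feature set

def anneal_features(steps=2000, t_start=1.0, t_end=1e-3):
    current = {f for f in range(NUM_FEATURES) if random.random() < 0.5}
    current_score = evaluate(current)
    best, best_score = set(current), current_score
    for i in range(steps):
        t = t_start * (t_end / t_start) ** (i / steps)
        cand = flip_one(current)
        cand_score = evaluate(cand)
        if cand_score > current_score or random.random() < math.exp((cand_score - current_score) / t):
            current, current_score = cand, cand_score
            if current_score > best_score:
                best, best_score = set(current), current_score
    return sorted(best), best_score

print(anneal_features())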
3.4. Graph Sampling and Data Augmentation
- Dynamic Sampling Optimization: SA can dynamically optimize the node or subgraph sampling strategies during training by exploring different sampling schemes in early iterations. For instance, in the initial stages, SA may allow less optimal sampling configurations, such as smaller subgraphs or nodes with low centrality. As training progresses, it refines these sampling schemes to focus on more informative subgraphs or higher-degree nodes.
- Data Augmentation: SA can explore different graph augmentation strategies to improve generalization. By allowing worse augmentations early on and gradually focusing on better ones, SA can help the GNN train on more varied and informative samples, leading to improved robustness.
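One possible way to encode a joint sampling-and-augmentation configuration and anneal it is sketched below. The fan-out, edge-drop, and feature-mask grids and the scoring proxy are illustrative assumptions, not values taken from any experiment.

```python
import math
import random

# Illustrative minibatch sampling and augmentation grid (assumed values).
FANOUTS = [5, 10, 15, 25]          # neighbors sampled per layer
EDGE_DROP = [0.0, 0.1, 0.2, 0.3]   # fraction of edges dropped for augmentation
FEAT_MASK = [0.0, 0.1, 0.2]        # fraction of node features masked

def evaluate(cfg):
    """Hypothetical proxy: validation accuracy minus a compute-cost term for large fan-outs."""
    acc = (0.75
           + 0.002 * min(cfg["fanout"], 15)
           - 0.05 * abs(cfg["edge_drop"] - 0.1)
           - 0.05 * abs(cfg["feat_mask"] - 0.1))
    return acc - 0.001 * cfg["fanout"]

def neighbor(cfg):
    """Re-draw one component of the sampling/augmentation configuration."""
    new = dict(cfg)
    key = random.choice(["fanout", "edge_drop", "feat_mask"])
    choices = {"fanout": FANOUTS, "edge_drop": EDGE_DROP, "feat_mask": FEAT_MASK}[key]
    new[key] = random.choice([v for v in choices if v != cfg[key]])
    return new

def anneal_sampling(steps=1000, t_start=0.5, t_end=1e-3):
    cfg = {"fanout": random.choice(FANOUTS),
           "edge_drop": random.choice(EDGE_DROP),
           "feat_mask": random.choice(FEAT_MASK)}
    score = evaluate(cfg)
    best, best_score = cfg, score
    for i in range(steps):
        t = t_start * (t_end / t_start) ** (i / steps)
        cand = neighbor(cfg)
        cand_score = evaluate(cand)
        if cand_score > score or random.random() < math.exp((cand_score - score) / t):
            cfg, score = cand, cand_score
            if score > best_score:
                best, best_score = cfg, score
    return best, best_score

print(anneal_sampling())
```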
3.5. Graph Structure Learning
- Edge Pruning and Addition: SA can iteratively prune or add edges to the graph during training, allowing it to probabilistically accept worse graph structures in early iterations. As the annealing process progresses, the focus shifts towards refining the graph to improve model performance. For instance, SA can help remove noisy or irrelevant edges that might otherwise negatively impact the learning process.
- Learning Better Topologies: This is especially useful in problems where the graph structure is only partially observed (e.g., link prediction or semi-supervised learning), where SA can optimize the graph’s topology to improve the predictive performance of the GNN.
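A small sketch of the corresponding SA move is given below. The toy graph, the assumed "clean" topology, and the scoring function are illustrative placeholders for a real downstream GNN evaluation; the outer cooling loop would follow the same pattern as the hyperparameter sketch in Section 3.1.

```python
import math
import random

# Toy graph as an edge set over `n` nodes; scoring is a hypothetical proxy
# for downstream GNN performance on the modified graph.
n = 6
edges = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5), (1, 4)}
TRUE_EDGES = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)}  # assumed clean topology

def evaluate(edge_set):
    """Reward overlap with the assumed clean topology, penalize spurious edges."""
    return len(edge_set & TRUE_EDGES) - 0.5 * len(edge_set - TRUE_EDGES)

def perturb(edge_set):
    """One SA move: either prune an existing edge or add a random missing edge."""
    new = set(edge_set)
    if new and random.random() < 0.5:
        new.discard(random.choice(sorted(new)))   # prune a noisy/irrelevant edge
    else:
        u, v = sorted(random.sample(range(n), 2))
        new.add((u, v))                           # add a candidate edge
    return new

def accept(current, candidate, t):
    """Metropolis acceptance test at temperature t."""
    delta = evaluate(candidate) - evaluate(current)
    return delta > 0 or random.random() < math.exp(delta / t)

candidate = perturb(edges)
print("accepted" if accept(edges, candidate, t=0.5) else "rejected", candidate)
```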
3.6. Optimization of Loss Functions
4. Tabu Search for GCN and GNN
- Hyperparameter Optimization: TS efficiently navigates the hyperparameter space (e.g., learning rate, layer depth) by utilizing a memory structure (tabu list) to avoid revisiting less promising configurations, leading to faster convergence and improved model performance [10].
- Architecture Search: TS can optimize GNN/GCN architectures by avoiding redundant exploration of previously evaluated layer structures or aggregation functions, promoting diverse and potentially more effective configurations.
- Feature Selection: TS can help select optimal node features by keeping track of previously discarded feature sets, thereby focusing on new and potentially better combinations of node attributes.
- Graph Sampling: In large-scale graphs, TS can optimize node or edge sampling strategies by avoiding inefficient sampling configurations, leading to better model efficiency without sacrificing performance.
- Graph Structure Learning: TS can optimize the graph’s topology by modifying edges (e.g., adding/removing connections) and preventing repetitive suboptimal adjustments, improving tasks like link prediction and semi-supervised learning.
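The following sketch illustrates how TS with a short-term tabu list could be applied to GNN hyperparameter selection. The discrete grid and the `evaluate` proxy are illustrative assumptions standing in for real training runs.

```python
import random
from collections import deque

# Illustrative discrete hyperparameter grid (assumed values).
SPACE = {
    "lr": [0.001, 0.005, 0.01, 0.05],
    "num_layers": [1, 2, 3, 4],
    "hidden": [16, 32, 64, 128],
}

def evaluate(cfg):
    """Hypothetical stand-in for GNN validation accuracy under `cfg`."""
    return (1.0 - abs(cfg["lr"] - 0.01) * 10
            - abs(cfg["num_layers"] - 2) * 0.05
            - abs(cfg["hidden"] - 64) / 512)

def neighbors(cfg):
    """All configurations differing from `cfg` in exactly one hyperparameter."""
    out = []
    for key, values in SPACE.items():
        for v in values:
            if v != cfg[key]:
                out.append({**cfg, key: v})
    return out

def tabu_search(iterations=50, tabu_tenure=7):
    current = {k: random.choice(v) for k, v in SPACE.items()}
    best, best_score = current, evaluate(current)
    tabu = deque(maxlen=tabu_tenure)  # short-term memory of visited configurations
    tabu.append(tuple(sorted(current.items())))
    for _ in range(iterations):
        candidates = [c for c in neighbors(current)
                      if tuple(sorted(c.items())) not in tabu]
        if not candidates:
            break
        # Move to the best non-tabu neighbor, even if it is worse than `current`.
        current = max(candidates, key=evaluate)
        tabu.append(tuple(sorted(current.items())))
        score = evaluate(current)
        if score > best_score:
            best, best_score = current, score
    return best, best_score

print(tabu_search())
```

Unlike SA, the move here is deterministic (best non-tabu neighbor); it is the tabu list, not probabilistic acceptance, that prevents cycling back to recently visited configurations.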
5. VNS for GCN and GNN
5.1. Optimizing Hyperparameters through Neighborhood Exploration
5.2. Improving Graph Sampling Techniques
5.3. Neighborhood Search for Network Architectures
5.4. Feature Selection Using VNS
5.5. Dynamic Neighborhood Selection for Training GNNs
5.6. Escaping Local Optima in Graph Structure Learning
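To illustrate the basic VNS mechanics in the hyperparameter setting of Section 5.1, the sketch below alternates shaking in neighborhoods of increasing size with a greedy local search. The grid and scoring proxy are illustrative assumptions in place of actual GNN training.

```python
import random

# Illustrative hyperparameter grid and a hypothetical validation-accuracy proxy.
SPACE = {
    "lr": [0.001, 0.005, 0.01, 0.05, 0.1],
    "hidden": [16, 32, 64, 128, 256],
    "dropout": [0.0, 0.2, 0.4, 0.6],
}

def evaluate(cfg):
    return (1.0 - abs(cfg["lr"] - 0.01) * 5
            - abs(cfg["hidden"] - 64) / 512
            - abs(cfg["dropout"] - 0.4) * 0.2)

def shake(cfg, k):
    """Neighborhood N_k: re-draw k randomly chosen hyperparameters."""
    new = dict(cfg)
    for key in random.sample(list(SPACE), k):
        new[key] = random.choice(SPACE[key])
    return new

def local_search(cfg):
    """Greedy descent over single-parameter changes."""
    improved = True
    while improved:
        improved = False
        for key, values in SPACE.items():
            for v in values:
                cand = {**cfg, key: v}
                if evaluate(cand) > evaluate(cfg):
                    cfg, improved = cand, True
    return cfg

def vns(k_max=3, iterations=30):
    current = local_search({k: random.choice(v) for k, v in SPACE.items()})
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            candidate = local_search(shake(current, k))
            if evaluate(candidate) > evaluate(current):
                current, k = candidate, 1  # improvement: restart from the smallest neighborhood
            else:
                k += 1                     # no improvement: try a larger neighborhood
    return current, evaluate(current)

print(vns())
```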
6. Conclusions
References
- Wang, Z.; Zhu, Y.; Li, Z.; Wang, Z.; Qin, H.; Liu, X. Graph neural network recommendation system for football formation. Applied Science and Biotechnology Journal for Advanced Research 2024, 3, 33–39.
- Glover, F. Tabu Search—Part I. ORSA Journal on Computing 1989, 1, 190–206.
- Vishwanathan, S.; Schraudolph, N.N.; Kondor, R.; Borgwardt, K.M. Graph kernels. Journal of Machine Learning Research 2010, 11, 1201–1242.
- Shervashidze, N.; Schweitzer, P.; Leeuwen, E.J.v.; Mehlhorn, K.; Borgwardt, K.M. Weisfeiler-Lehman graph kernels. Advances in Neural Information Processing Systems, 2011, pp. 1–9.
- Schlichtkrull, M.; Kipf, T.N.; Bloem, P.; van den Berg, R.; Titov, I.; Welling, M. Modeling relational data with graph convolutional networks. European Semantic Web Conference, 2018, pp. 593–607.
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. International Conference on Learning Representations (ICLR), 2017.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. International Conference on Learning Representations (ICLR), 2018.
- Hamilton, W.L.; Ying, R.; Leskovec, J. Inductive representation learning on large graphs. Advances in Neural Information Processing Systems, 2017, pp. 1024–1034.
- Rossi, E.; Chamberlain, B.; Frasca, F.; Eynard, D.; Monti, F.; Bronstein, M. Temporal graph networks for deep learning on dynamic graphs. arXiv preprint arXiv:2006.10637, 2020.
- Guo, X.; Quan, Y.; Zhao, H.; Yao, Q.; Li, Y.; Tu, W. TabGNN: Multiplex graph neural network for tabular data prediction. arXiv preprint arXiv:2108.09127, 2021.
- Dorigo, M.; Gambardella, L.M. Ant Colonies for the Traveling Salesman Problem. Biosystems 1997, 43, 73–81.
- Mladenović, N.; Hansen, P. Variable Neighborhood Search. Computers & Operations Research 1997, 24, 1097–1100.
- Johnn, S.N.; Darvariu, V.A.; Handl, J.; Kalcsics, J. A Graph Reinforcement Learning Framework for Neural Adaptive Large Neighbourhood Search. Computers & Operations Research 2024, 172, 106791.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).