Submitted: 21 December 2023
Posted: 22 December 2023
Abstract
Keywords:
1. Introduction
2. Preliminaries
- Masking Mechanism: GMAEs take partially masked graphs as input, where a predetermined number of nodes is intentionally masked. This selective masking reduces the amount of information the model needs to process simultaneously, leading to increased memory efficiency during training.
- Asymmetric Encoder-Decoder Architecture: The GMAE model adopts an asymmetric architecture, employing a deep transformer encoder to extract rich representations from the unmasked nodes in the graph. The decoder, in contrast, is a shallower transformer network whose role is to reconstruct the features of the masked nodes from the encoded information produced by the encoder. This design choice may contribute to a more effective and efficient information flow within the model.
- Self-Supervised Learning: GMAEs are trained using a self-supervised learning approach. In this context, the model is tasked with predicting the features of the masked nodes from the remaining information in the graph. This self-supervised learning paradigm is advantageous as it eliminates the dependency on labeled data, which is often scarce or expensive to obtain in real-world scenarios. The model learns to capture meaningful representations and relationships within the graph by leveraging the data’s intrinsic structure.
- Randomly mask nodes in the input graph.
- Feed the non-masked nodes into the encoder and obtain their embeddings.
- Use a shared learnable mask token to represent the embeddings of the masked nodes and insert them into the output of the encoder.
- Feed the embedding matrix with inserted mask tokens into the decoder to reconstruct the features of the masked nodes.
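The four steps above can be sketched in a few lines of NumPy. Random linear maps stand in for the deep transformer encoder and shallow transformer decoder, and all dimensions and weights are illustrative, not the actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph: 10 nodes with 8-dimensional features (values are illustrative).
num_nodes, d = 10, 8
X = rng.normal(size=(num_nodes, d))

# 1. Randomly mask a fraction of the nodes.
mask_ratio = 0.3
masked = rng.choice(num_nodes, size=int(mask_ratio * num_nodes), replace=False)
visible = np.setdiff1d(np.arange(num_nodes), masked)

# 2. Encode only the non-masked nodes (a random linear map stands in
#    for the deep transformer encoder).
W_enc = rng.normal(size=(d, d))
H = np.empty((num_nodes, d))
H[visible] = np.tanh(X[visible] @ W_enc)

# 3. Insert one shared (learnable) mask token at every masked position
#    of the encoder output.
mask_token = rng.normal(size=d)
H[masked] = mask_token

# 4. Decode the embedding matrix to reconstruct the masked-node features
#    (a random linear map stands in for the shallow transformer decoder).
W_dec = rng.normal(size=(d, d))
X_hat = H @ W_dec

# Self-supervised objective: reconstruction error on the masked nodes only.
loss = float(np.mean((X_hat[masked] - X[masked]) ** 2))
```

In training, `W_enc`, `W_dec`, and `mask_token` would be learned by minimizing `loss`; the point of the sketch is the data flow, not the model capacity.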
3. Approach
- Load a graph G = (V, E) together with its node feature matrix X_V (of dimension d_V).
Repeat Niter times:
- Randomly mask a fraction Fr of nodes in the input graph.
- Feed the non-masked nodes into the encoder and obtain their embeddings.
- Use a shared learnable mask token to represent the embeddings of the masked nodes and insert them into the output of the encoder.
- Calculate the similarity score for all pairs of the masked nodes using the measure S.
- Reconstruct the network of the omitted masked nodes by identifying potential links whose similarity scores meet or exceed the threshold Tr.
- For each connection, count how many times it is rebuilt throughout the iterations.
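The repeated masking-scoring-counting loop above can be sketched as a single function. Here `X` stands in for the node embeddings produced by the encoder, and all names and default values are illustrative:

```python
import numpy as np
from itertools import combinations
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two vectors (the measure S)."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

def count_reconstructed_edges(X, n_iter=100, fr=0.2, tr=0.95, seed=0):
    """Repeat n_iter times: mask a random fraction fr of the nodes,
    score every pair of masked nodes with cosine similarity, and count
    how often each candidate edge clears the threshold tr."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    counts = Counter()
    for _ in range(n_iter):
        masked = rng.choice(n, size=max(2, int(fr * n)), replace=False)
        for i, j in combinations(sorted(masked), 2):
            if cosine(X[i], X[j]) >= tr:
                counts[(i, j)] += 1
    return counts
```

An edge that is rebuilt in almost every iteration is a strong candidate link, while an edge that appears only occasionally can be attributed to chance.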
- Focused analysis: Concentrating on the ego graph of a masked node lets the model devote its resources to reconstructing the missing connections of that specific node.
- Similarity-based reconstruction: GMAE utilizes similarity scores between nodes to infer potential connections. The ego graph provides a smaller, more manageable context for comparing the similarity of neighboring nodes to the masked node, making the reconstruction process more efficient.
- Threshold-based filtering: The model can set a threshold for the similarity score. Only edges with similarity scores exceeding this threshold are considered potential connections for the masked node. This helps avoid reconstructing spurious connections based on weak similarities.
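The three points above combine into a small per-node routine: restrict attention to the ego graph of a masked node, score its candidate neighbors by cosine similarity, and keep only the edges that pass the threshold. The function and parameter names below are illustrative:

```python
import numpy as np

def reconstruct_ego_links(v, adjacency, Z, l2=5, tr=0.95):
    """For masked node v, consider at most l2 neighbors from its ego
    graph and keep only the links whose cosine similarity to v reaches
    the threshold tr.

    adjacency: dict mapping a node to a list of its neighbors
    Z: embedding matrix (one row per node)
    """
    neighbors = adjacency.get(v, [])[:l2]     # ego graph, capped at l2 nodes
    z_v = Z[v] / np.linalg.norm(Z[v])
    kept = []
    for u in neighbors:
        z_u = Z[u] / np.linalg.norm(Z[u])
        if float(z_v @ z_u) >= tr:            # threshold-based filtering
            kept.append((v, u))
    return kept
```

Because only the ego graph of each masked node is scanned, the number of similarity evaluations per node is bounded by `l2` instead of the full graph size.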
4. Experiments
4.1. Cora dataset

- d = 64 (embedding dimension)
- N_encoder_layers = 4 (number of encoder layers)
- L2 = 5, 2708 (number of neighbors in the ego graphs; the two settings tested)
- N_iter = 100 (number of epochs in the training process)
- Fr = 20% (fraction of omitted nodes)
- S = cosine similarity (similarity measure)
- Tr = 0.95 (link-prediction threshold)
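The settings listed above can be collected into a single configuration object. The variable names are illustrative, and the slash notation for L2 in the original list is read here as the two tested values:

```python
# Hyperparameters for the Cora run, as listed above
# (names are illustrative, not from the original implementation).
cora_config = {
    "d": 64,                 # embedding dimension
    "n_encoder_layers": 4,   # number of encoder layers
    "L2": (5, 2708),         # neighbors in ego graphs (two settings)
    "n_iter": 100,           # number of training epochs
    "fr": 0.20,              # fraction of omitted (masked) nodes
    "similarity": "cosine",  # similarity measure S
    "tr": 0.95,              # link-prediction threshold
}
```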
|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| Length | 13 | 3 | 2 | 9 |
| Upper bound | 13 | 16 | 18 | 27 |
| Mean | 7 | 15 | 17.5 | 23 |

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| Length | 12 | 4 | 3 | 19 |
| Upper bound | 12 | 16 | 19 | 28 |
| Mean | 6.5 | 14.5 | 18 | 29 |
4.2. CiteSeer dataset

- d = 64 (embedding dimension)
- N_encoder_layers = 4 (number of encoder layers)
- L2 = 5, 100 (number of neighbors in the ego graphs; the two settings tested below)
- N_iter = 100 (number of epochs in the training process)
- Fr = 70% (fraction of omitted nodes)
- S = cosine similarity (similarity measure)
- Tr = 0.9, 0.95 (link-prediction thresholds)
4.2.1. The case of L2=5 and Tr=0.95

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| Length | 9 | 5 | 5 | 25 |
| Upper bound | 9 | 14 | 19 | 44 |
| Mean | 5 | 12 | 17 | 32 |
4.2.2. The case of L2=100 and Tr=0.95

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| Length | 11 | 4 | 4 | 19 |
| Upper bound | 11 | 15 | 19 | 38 |
| Mean | 11 | 15 | 19 | 38 |
4.2.3. The case of L2=100 and Tr=0.9

|  |  |  |  |  |
| --- | --- | --- | --- | --- |
| Length | 11 | 4 | 4 | 19 |
| Upper bound | 11 | 15 | 19 | 38 |
| Mean | 6 | 13.5 | 17.5 | 29 |
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
References

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).