Submitted: 18 September 2024
Posted: 20 September 2024
Abstract
Keywords:
1. Introduction

2. Background
2.1. Swin-Transformer [12]
2.2. Transformer and GNN
2.3. GNN and Classification
2.4. Fine-Grained Classification of Aircraft

3. Materials and Methods

3.1. Feature Application in Swin-Transformer



3.2. The Process of Forward and Backward Propagation

4. Experiments
4.1. Datasets and Various Artificial Labeling Criteria
- Classification Criterion 1: At this level, after merging visually indistinguishable aircraft models, we divide the dataset into model variants whose finer distinctions are visually detectable.
- Classification Criterion 2: Model variants that differ only subtly under Criterion 1 are grouped into families, which makes the differences between classes more substantial and yields a classification task of intermediate difficulty.
- Classification Criterion 3: Building on Criterion 2, we assign all aircraft produced by the same manufacturer to the same category.

| Manufacturer | Family | Variant |
|---|---|---|
| Airbus | A300 | A300B4 |
| Airbus | A310 | A310 |
| Airbus | A320 | A318 |
| Airbus | A320 | A319 |
| Airbus | A320 | A320 |
| Airbus | A320 | A321 |
| Airbus | A330 | A330-200 |
| Airbus | A330 | A330-300 |
| Airbus | A340 | A340-200 |
| Airbus | A340 | A340-300 |
| Airbus | A340 | A340-500 |
| Airbus | A340 | A340-600 |
| Airbus | A380 | A380 |
| Antonov | An-12 | An-12 |
| ATR | ATR-42 | ATR-42 |
| ATR | ATR-72 | ATR-72 |
| British Aerospace | BAE | BAE |
| British Aerospace | BAE-125 | BAE-125 |
| Beechcraft | Beechcraft | Beechcraft |
| Douglas Aircraft Company | C-47 | C-47 |
| Lockheed Corporation | C-130 | C-130 |
| Cessna | Cessna | Cessna |
| Canadair | Challenger | Challenger |
| Canadair | CRJ-200 | CRJ-200 |
| Canadair | CRJ-700 | CRJ-700 |
| Canadair | CRJ-700 | CRJ-900 |
| Boeing | Boeing 707 | 707-320 |
| Boeing | Boeing 727 | 727-200 |
| Boeing | Boeing 737 | 737-200 |
| Boeing | Boeing 737 | 737-300 |
| Boeing | Boeing 737 | 737-400 |
| Boeing | Boeing 737 | 737-500 |
| Boeing | Boeing 737 | 737-600 |
| Boeing | Boeing 737 | 737-700 |
| Boeing | Boeing 737 | 737-800 |
| Boeing | Boeing 737 | 737-900 |
| Boeing | Boeing 747 | 747-100 |
| Boeing | Boeing 747 | 747-200 |
| Boeing | Boeing 747 | 747-300 |
| Boeing | Boeing 747 | 747-400 |
| Boeing | Boeing 757 | 757-200 |
| Boeing | Boeing 757 | 757-300 |
| Boeing | Boeing 767 | 767-200 |
| Boeing | Boeing 767 | 767-300 |
| Boeing | Boeing 767 | 767-400 |
| Boeing | Boeing 777 | 777-200 |
| Boeing | Boeing 777 | 777-300 |
| Boeing | Boeing 717 | 717 |
| Douglas Aircraft Company | DC-3 | DC-3 |
| Douglas Aircraft Company | DC-6 | DC-6 |
| Douglas Aircraft Company | DC-8 | DC-8 |
| McDonnell Douglas | DC-9 | DC-9-30 |
| McDonnell Douglas | DC-10 | DC-10 |
| McDonnell Douglas | MD-11 | MD-11 |
| McDonnell Douglas | MD-80 | MD-80 |
| McDonnell Douglas | MD-80 | MD-87 |
| McDonnell Douglas | MD-90 | MD-90 |
| McDonnell Douglas | F | F |
| Eurofighter | Eurofighter | Eurofighter |
| Lockheed Martin | F-16 | F-16A |
| Dassault Aviation | Falcon | Falcon |
| Fokker | Fokker | Fokker |
| Bombardier Aerospace | Global | Global |
| Gulfstream Aerospace | Gulfstream | Gulfstream |
| British Aerospace | Hawk | Hawk |
| Ilyushin | Il-76 | Il-76 |
| Lockheed Corporation | L-1011 | L-1011 |
| Fairchild | Metroliner | Metroliner |
| Beechcraft | King Air | Model |
| Piper | PA-28 | PA-28 |
| Saab | Saab | Saab |
| Supermarine | Spitfire | Spitfire |
| Cirrus Aircraft | SR-20 | SR-20 |
| Panavia | Tornado | Tornado |
| Tupolev | Tu-134 | Tu-134 |
| Tupolev | Tu-154 | Tu-154 |
| Yakovlev | Yak-42 | Yak-42 |
| de Havilland | DH-82 | DH-82 |
| de Havilland | DHC-1 | DHC-1 |
| de Havilland | DHC-6 | DHC-6 |
| de Havilland | Dash 8 | DHC-8-100 |
| de Havilland | Dash 8 | DHC-8-300 |
| Dornier | Dornier | Dornier |
| Robin | DR-400 | DR-400 |
| Embraer | Embraer E | E-170 |
| Embraer | Embraer E | E-190 |
| Embraer | Embraer E | E-195 |
| Embraer | EMB-120 | EMB-120 |
| Embraer | Embraer | Embraer |
| Embraer | ERJ | ERJ |
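
The three classification criteria can be read as successive coarsenings of a single variant, family, manufacturer hierarchy. The following Python sketch illustrates that reading on a small excerpt of the hierarchy table. The names `HIERARCHY`, `label_for`, and `class_index` are hypothetical helpers introduced only for this illustration; they are not taken from the paper's implementation, which is not described here.

```python
# Excerpt of the hierarchy table: variant -> (family, manufacturer).
# Only a handful of rows are copied here; the full table has many more.
HIERARCHY = {
    "A318": ("A320", "Airbus"),
    "A319": ("A320", "Airbus"),
    "A320": ("A320", "Airbus"),
    "A321": ("A320", "Airbus"),
    "A330-200": ("A330", "Airbus"),
    "A330-300": ("A330", "Airbus"),
    "737-700": ("Boeing 737", "Boeing"),
    "737-800": ("Boeing 737", "Boeing"),
    "747-400": ("Boeing 747", "Boeing"),
    "MD-80": ("MD-80", "McDonnell Douglas"),
    "MD-87": ("MD-80", "McDonnell Douglas"),
}


def label_for(variant: str, criterion: int) -> str:
    """Return the class label of a variant under Criterion 1, 2, or 3."""
    family, manufacturer = HIERARCHY[variant]
    if criterion == 1:
        return variant        # finest level: model variant
    if criterion == 2:
        return family         # intermediate level: family
    if criterion == 3:
        return manufacturer   # coarsest level: manufacturer
    raise ValueError(f"unknown criterion: {criterion}")


def class_index(criterion: int) -> dict:
    """Map each distinct label under a criterion to an integer class id."""
    labels = sorted({label_for(v, criterion) for v in HIERARCHY})
    return {label: i for i, label in enumerate(labels)}


if __name__ == "__main__":
    for c in (1, 2, 3):
        print(f"Criterion {c}: {len(class_index(c))} classes in this excerpt")
```

In this excerpt the same images fall into fewer classes as the criterion moves from variant to family to manufacturer, which mirrors how the three labeling schemes relate to one another.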
4.2. Implementation Details
4.3. Evaluation Metrics
4.4. Ablation Study and Analysis
4.4.1. Ablation Studies
| Method usage | Accuracy (%), Criterion 1 | Accuracy (%), Criterion 2 | Accuracy (%), Criterion 3 |
|---|---|---|---|
| ✓ | 94.329 | 88.830 | 76.952 |
| ✓ ✓ | 94.629 | 91.146 | 77.632 |
| ✓ ✓ ✓ | 94.629 | 92.544 | 80.572 |
| ✓ ✓ ✓ | 94.689 | 89.340 | 75.000 |
| ✓ ✓ ✓ ✓ | 94.779 | 97.727 | 80.645 |
4.4.2. Further Study of Information Encoding under Various Data Partition Criteria
| Method | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| Criterion 1 | 94.659 | 94.569 | 94.869 | 95.080 | 94.419 | 94.689 |
| Criterion 2 | 92.204 | 94.737 | 91.255 | 93.750 | 92.230 | 89.340 |
| Criterion 3 | 81.250 | 79.051 | 77.619 | 72.699 | 82.168 | 75.000 |

4.4.3. Comparison with Other Well-Known Models
| Fine-grained classification model | Accuracy (%), Criterion 1 | Accuracy (%), Criterion 2 | Accuracy (%), Criterion 3 |
|---|---|---|---|
| NTS-Net | 92.643 | 92.217 | 81.626 |
| API-Net | 79.922 | 71.939 | 53.463 |
| DFL | 78.848 | 84.338 | 74.017 |
| FGVC | 94.659 | 92.173 | 75.032 |
| Bilinear-CNN | 86.169 | 84.638 | 72.067 |
| ours1 | 95.080 | 93.750 | 72.699 |
| ours2 | 94.419 | 92.230 | 82.168 |
| ours3 | 94.779 | 97.727 | 80.645 |
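
As a reading aid for the comparison table, the short Python sketch below picks the highest-scoring model in each accuracy column. It assumes that the three accuracy columns correspond, in order, to Classification Criteria 1 to 3, which is consistent with the values reported in Section 4.4.2 but is not stated in the table header. `RESULTS` and `best_per_column` are hypothetical names used only for this illustration.

```python
# Accuracy values transcribed from the comparison table, one tuple per model.
# Assumption: the three columns are Criteria 1, 2, and 3, in that order.
RESULTS = {
    "NTS-Net": (92.643, 92.217, 81.626),
    "API-Net": (79.922, 71.939, 53.463),
    "DFL": (78.848, 84.338, 74.017),
    "FGVC": (94.659, 92.173, 75.032),
    "Bilinear-cnn": (86.169, 84.638, 72.067),
    "ours1": (95.080, 93.750, 72.699),
    "ours2": (94.419, 92.230, 82.168),
    "ours3": (94.779, 97.727, 80.645),
}


def best_per_column(results):
    """Return [(model, accuracy), ...] with the top model for each column."""
    n_columns = len(next(iter(results.values())))
    best = []
    for col in range(n_columns):
        model = max(results, key=lambda name: results[name][col])
        best.append((model, results[model][col]))
    return best


if __name__ == "__main__":
    for col, (model, acc) in enumerate(best_per_column(RESULTS), start=1):
        print(f"Column {col}: best = {model} ({acc:.3f})")
```

Under that column assumption, a different variant of the proposed method leads each column, so no single configuration is best under all three criteria.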

5. Conclusions
References
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” in International Conference on Learning Representations, 2021.
- Y. Ding, Y. Zhou, Y. Zhu, Q. Ye, and J. Jiao, “Selective sparse sampling for fine-grained image recognition,” in Proc. IEEE/CVF Int. Conf. Comput. Vision, 2019, pp. 6598–6607.
- J. Fu, H. Zheng, and T. Mei, “Look closer to see better: Recurrent attention convolutional neural network for fine-grained image recognition,” in Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2017, pp. 4476–4484.
- C. Liu, H. Xie, Z.-J. Zha, L. Ma, L. Yu, and Y. Zhang, “Filtration and distillation: Enhancing region attention for fine-grained visual categorization,” in Proc. AAAI Conf. Artif. Intell., vol. 34, 2020, pp. 11555–11562.
- Y. Peng, X. He, and J. Zhao, “Object-part attention model for fine-grained image classification,” IEEE Trans. Image Process., vol. 27, no. 3, pp. 1487–1500, Mar. 2018.
- N. Zhang, J. Donahue, R. Girshick, and T. Darrell, “Part-based R-CNNs for fine-grained category detection,” in Proc. Eur. Conf. Comput. Vision, 2014, pp. 834–849.
- X. Zhang, H. Xiong, W. Zhou, and Q. Tian, “Fused one-vs-all features with semantic alignments for fine-grained visual categorization,” IEEE Trans. Image Process., vol. 25, no. 2, pp. 878–892, Feb. 2016.
- H. Zheng, J. Fu, Z.-J. Zha, and J. Luo, “Looking for the devil in the details: Learning trilinear attention sampling network for fine-grained image recognition,” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., 2019, pp. 5007–5016.
- Y. Hu, Y. Yang, J. Zhang, X. Cao, and X. Zhen, “Attentional kernel encoding networks for fine-grained visual categorization,” IEEE Trans. Circuits Syst. Video Technol., vol. 31, no. 1, pp. 301–314, Jan. 2021.
- M. Zhou, Y. Bai, W. Zhang, T. Zhao, and T. Mei, “Look-into-object: Self-supervised structure modeling for object recognition,” in Proc. IEEE/CVF Conf. Comput. Vision Pattern Recognit., 2020, pp. 11771–11780.
- Nuo Chen, Fenglin Liu, Chenyu You, Peilin Zhou, and Yuexian Zou. 2021. Adaptive bi-directional attention: Exploring multi-granularity representations for machine reading comprehension. In ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 7833–7837.
- Peiyan Zhang, Haoyang Liu, Chaozhuo Li, Xing Xie, Sunghun Kim, and Haohan Wang. 2023. Foundation Model-oriented Robustness: Robust Image Model Evaluation with Pretrained Models. arXiv:2308.10632.
- Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Oct. 2021, pp. 10012–10022.
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need.
- Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Francis EH Tay, Jiashi Feng, and Shuicheng Yan. Tokens-to-token vit: Training vision transformers from scratch on imagenet. arXiv:2101.11986.
- MMSegmentation Contributors. MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark. https://github.com/openmmlab/mmsegmentation, 2020.
- Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, and Yunhe Wang. Transformer in transformer. arXiv:2103.00112.
- Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, pages 5998–6008, 2017.
- Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2021.
- Kai Zhan, Yaokang Zhu, Jun Wang, and Jie Zhang. 2020. Adaptive Structural Fingerprints for Graph Attention Networks. In International Conference on Learning Representations. https://openreview.net/forum?
- Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph attention networks. In Proceedings of the 6th International Conference on Learning Representations.
- Weirui Kuang, Zhen Wang, Yaliang Li, Zhewei Wei, and Bolin Ding. 2022. Coarformer: Transformer for large graph via graph coarsening. https://openreview.net/forum?
- Gabriele Corso, Luca Cavalleri, Dominique Beaini, Pietro Liò, and Petar Veličković. 2020. Principal neighbourhood aggregation for graph nets. Advances in Neural Information Processing Systems 33 (2020), 13260–13271.
- Jiayan Guo, Lun Du, Wendong Bi, Qiang Fu, Xiaojun Ma, Xu Chen, Shi Han, Dongmei Zhang, and Yan Zhang. 2023. Homophily-oriented Heterogeneous Graph Rewiring. In Proceedings of the ACM Web Conference 2023. 511–522.
- Jiayan Guo, Lun Du, Xu Chen, Xiaojun Ma, Qiang Fu, Shi Han, Dongmei Zhang, and Yan Zhang. 2023. On Manipulating Signals of User-Item Graph: A Jacobi Polynomial-based Graph Collaborative Filtering. arXiv:2306.03624.
- Jiayan Guo, Shangyang Li, and Yan Zhang. 2023. An Information Theoretic Perspective for Heterogeneous Subgraph Federated Learning. In International Conference on Database Systems for Advanced Applications. Springer, 745–760.
- Jiayan Guo, Shangyang Li, Yue Zhao, and Yan Zhang. 2022. Learning robust representation through graph adversarial contrastive learning. In International Conference on Database Systems for Advanced Applications. Springer, 682–697.
- Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, and Meng Wang. 2020. Lightgcn: Simplifying and powering graph convolution network for recommendation. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval. 639–648.
- Johannes Klicpera, Aleksandar Bojchevski, and Stephan Günnemann. 2018. Predict then propagate: Graph neural networks meet personalized pagerank. arXiv:1810.05997.
- Johannes Klicpera, Shankari Giri, Johannes T Margraf, and Stephan Günnemann. 2020. Fast and uncertainty-aware directional message passing for non-equilibrium molecules. arXiv:2011.14115.
- Johannes Klicpera, Janek Groß, and Stephan Günnemann. 2020. Directional message passing for molecular graphs. arXiv:2003.03123.
- Xiaojun Ma, Qin Chen, Yuanyi Ren, Guojie Song, and Liang Wang. 2022. MetaWeight Graph Neural Network: Push the Limits Beyond Global Homophily. In Proceedings of the ACM Web Conference 2022. 1270–1280.
- Junshan Wang, Guojie Song, Yi Wu, and Liang Wang. 2020. Streaming Graph Neural Networks via Continual Learning. In Proceedings of the 29th ACM International Conference on Information & Knowledge Management. 1515–1524.
- Peiyan Zhang, Yuchen Yan, Chaozhuo Li, Senzhang Wang, Xing Xie, Guojie Song, and Sunghun Kim. 2023. Continual Learning on Dynamic Graphs via Parameter Isolation. arXiv:2305.13825.
- S.K. Zhou, H. Greenspan, C. Davatzikos, J.S. Duncan, B.V. Ginneken, A. Madabhushi, J.L. Prince, D. Rueckert, and R.M. Summers, A Review of Deep Learning in Medical Imaging, 2021.
- 2006.
- Feichtenbeiner A, Haas M, Büttner M, Grabenbauer GG, Fietkau R, Distel LV. Critical role of spatial interaction between CD8 and Foxp3 cells in human gastric cancer: the distance matters. 2014.
- Shen, Yiqing and Zhou, Bingxin and Xiong, Xinye and Gao, Ruitian and Wang, Yu Guang, How GNNs Facilitate CNNs in Mining Geometric Information from Large-Scale Medical Images, 2022.
- P: David and Armin, Mohammad Ali and Denman, Simon and Fookes, Clinton and Petersson, Lars, Graph-Based Deep Learning for Medical Diagnosis and Analysis, 2021.
- 2016.
- Ding, Kexin and Zhou, Mu and Wang, Zichen and Liu, Qiao and Arnold, Corey W. and Zhang, Shaoting and Metaxas, Dimitri N., Graph Convolutional Networks for Multi-modality Medical Imaging: Methods, Architectures, and Clinical Applications, 2022.
- Liang, Zhijun and Rojas, Juan and Liu, Junfa and Guan, Yisheng, Visual Semantic Graph Attention Networks for Human-Object Interaction Detection, 2021.
- Lecun, Y. and Bottou, L. and Bengio, Y. and Haffner, P. 1998.
- Aditya Khosla, Nityananda Jayadevaprakash, Bangpeng Yao, and Li Fei-Fei. Novel dataset for fine-grained image categorization. In CVPR Workshop on Fine-Grained Visual Categorization, 2011.
- J. Liu, A. Kanazawa, D. Jacobs, and P. Belhumeur. Dog breed classification using part localization. In Proc. ECCV, 2012.
- O. Parkhi, A. Vedaldi, C. V. Jawahar, and A. Zisserman. Cats vs dogs. In Proc. CVPR, 2012.
- C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie. The caltech-ucsd birds-200-2011 dataset. Technical report, California Institute of Technology, 2011.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).