Submitted: 24 September 2025
Posted: 25 September 2025
Abstract
Keywords:
1. Introduction
2. Main Development Trends
2.1. Journal Distribution
2.2. Annually Published Articles
2.3. Keyword Co-Occurrence Network
3. The Datasets for Fine-Grained Interpretation
3.1. Current Status of the Dataset
3.2. Existing Deficiencies of Datasets
3.3. Future Outlook of Fine-Grained Datasets
4. Methodology Taxonomy

4.1. Fine-Grained Pixel-Level Classification or Segmentation
4.1.1. Novel Data Representation
4.1.2. Modeling Relationships Between Coarse and Fine Classes
4.1.3. Multi-Source Data Integration
4.1.4. Advanced Data Annotation Strategies
4.2. Fine-Grained Object-Level Detection
4.2.1. Two-Stage Detectors

4.2.2. One-Stage Detectors
4.2.3. Other Methods for Fine-Grained Object Detection

4.3. Fine-Grained Scene-Level Recognition
4.3.1. Scene Classification

4.3.2. Image Retrieval
4.4. Summary of Methods
5. Discussion
5.1. Challenge
5.2. Future Directions
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Wan, M.; Zhong, G.; Wu, Q.; Zhao, X.; Lin, Y.; Lu, Y. CR-Mask RCNN: An Improved Mask RCNN Method for Airport Runway Detection and Segmentation in Remote Sensing Images. Sensors 2025, 25, 657. [CrossRef]
- Li, N.; et al. Airport Detection in Remote Sensing Real-Open World Using Deep Learning and Geographical Analysis. Engineering Applications of Artificial Intelligence 2023, 120, 106083. [CrossRef]
- Chen, F.; et al. HRTBDA: a network for post-disaster building damage assessment. Natural Hazards Review 2024. [CrossRef]
- Wu, Z.; et al. A Hybrid YOLO-E and SAM2 Approach for Damaged Building Extraction Using Multi-Source Remote Sensing Images. Sensors 2025, 25, 4375. [CrossRef]
- Yan, J.; Gu, X.; Chen, Y. CropSTS: A Remote Sensing Foundation Model for Cropland Classification with Decoupled Spatiotemporal Attention. Remote Sensing 2025, 17, 2481. [CrossRef]
- Xia, L.; et al. A precise spatiotemporal fusion crop classification framework for smallholder agricultural systems: PITT (Parcel-level Integration of Time series and Texture). Scientific Reports 2025, 15, 33351. [CrossRef]
- Zhang, S.; Cao, Y.; Bai, L.; Wu, Z. Research on Camouflage Target Classification and Recognition Based on Mid-Wave Infrared Hyperspectral Imaging. Remote Sensing 2025, 17, 1475. [CrossRef]
- Zhang, T.; Zhang, D.; Liu, Y. Research on Camouflage Target Detection Method Based on Dual Band Optics and SAR Image Fusion. In Proceedings of the International Conference on Image, Vision and Intelligent Systems (ICIVIS 2023). Springer, 2024, pp. 320–335. [CrossRef]
- Chen, F.; Ren, R.; Van de Voorde, T.; Xu, W.; Zhou, G.; Zhou, Y. Fast Automatic Airport Detection in Remote Sensing Images Using Convolutional Neural Networks. Remote Sensing 2018, 10, 443. [CrossRef]
- Zhang, B.; Zhao, L.; Zhang, X. Three-dimensional convolutional neural network model for tree species classification using airborne hyperspectral images. Remote Sensing of Environment 2020, 247, 111938.
- Zhu, Y.; Li, W.; Zhang, M.; Pang, Y.; Tao, R.; Du, Q. Joint feature extraction for multi-source data using similar double-concentrated network. 450, 70–79. [CrossRef]
- Yuan, S.; Lin, G.; Zhang, L.; Dong, R.; Zhang, J.; Chen, S.; Zheng, J.; Wang, J.; Fu, H. FUSU: A multi-temporal-source land use change segmentation dataset for fine-grained urban semantic understanding. Advances in Neural Information Processing Systems 2024, 37, 132417–132439.
- Liu, Z.; Yuan, L.; Weng, L.; Yang, Y. A high resolution optical satellite image dataset for ship recognition and some new baselines. In Proceedings of the International Conference on Pattern Recognition Applications and Methods. SciTePress, 2017, Vol. 2, pp. 324–331.
- Di, Y.; Jiang, Z.; Zhang, H. A public dataset for fine-grained ship classification in optical remote sensing images. Remote Sensing 2021, 13, 747.
- Zhang, Z.; Zhang, L.; Wang, Y.; Feng, P.; He, R. ShipRSImageNet: A large-scale fine-grained dataset for ship detection in high-resolution optical remote sensing images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2021, 14, 8458–8472.
- Wang, Z.; Zhou, Y.; Wang, F.; Wang, S.; Gao, G.; Zhu, J.; Wang, P.; Hu, K. MFBFS: High-resolution multispectral remote sensing image fine-grained building feature set. Journal of Remote Sensing 2024, 28.
- Xia, G.S.; Hu, J.; Hu, F.; Shi, B.; Bai, X.; Zhong, Y.; Zhang, L.; Lu, X. AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Transactions on Geoscience and Remote Sensing 2017, 55, 3965–3981.
- Cheng, G.; Han, J.; Lu, X. Remote Sensing Image Scene Classification: Benchmark and State of the Art. Proceedings of the IEEE 2017, 105, 1865–1883. [CrossRef]
- Zhou, W.; Newsam, S.; Li, C.; Shao, Z. PatternNet: A benchmark dataset for performance evaluation of remote sensing image retrieval. ISPRS Journal of Photogrammetry and Remote Sensing 2018, 145, 197–209.
- Qi, X.; Zhu, P.; Wang, Y.; Zhang, L.; Peng, J.; Wu, M.; Chen, J.; Zhao, X.; Zang, N.; Mathiopoulos, P.T. MLRSNet: A multi-label high spatial resolution remote sensing dataset for semantic scene understanding. ISPRS Journal of Photogrammetry and Remote Sensing 2020, 169, 337–350.
- Long, Y.; Xia, G.S.; Li, S.; Yang, W.; Yang, M.Y.; Zhu, X.X.; Zhang, L.; Li, D. On creating benchmark dataset for aerial image interpretation: Reviews, guidances, and Million-AID. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2021, 14, 4205–4230.
- Li, Y.; Wu, Y.; Cheng, G.; Tao, C.; Dang, B.; Wang, Y.; Zhang, J.; Zhang, C.; Liu, Y.; Tang, X.; et al. MEET: A Million-Scale Dataset for Fine-Grained Geospatial Scene Classification with Zoom-Free Remote Sensing Imagery. arXiv preprint arXiv:2503.11219 2025.
- Chen, K.; Wu, M.; Liu, J.; Zhang, C. FGSD: A dataset for fine-grained ship detection in high resolution satellite images. arXiv preprint arXiv:2003.06832 2020.
- Huang, X.; Ren, L.; Liu, C.; Wang, Y.; Yu, H.; Schmitt, M.; Hänsch, R.; Sun, X.; Huang, H.; Mayer, H. Urban building classification (UBC) – a dataset for individual building detection and classification from satellite imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 1413–1421.
- Huang, X.; Chen, K.; Tang, D.; Liu, C.; Ren, L.; Sun, Z.; Hänsch, R.; Schmitt, M.; Sun, X.; Huang, H.; et al. Urban building classification (UBC) v2 – a benchmark for global building detection and fine-grained classification from satellite imagery. IEEE Transactions on Geoscience and Remote Sensing 2023, 61, 1–16.
- Liu, G.; Peng, B.; Liu, T.; Zhang, P.; Yuan, M.; Lu, C.; Cao, N.; Zhang, S.; Huang, S.; Wang, T.; et al. Large-scale fine-grained building classification and height estimation for semantic urban reconstruction: Outcome of the 2023 IEEE GRSS data fusion contest. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2024, 17, 11194–11207.
- Wu, Z.Z.; Wan, S.H.; Wang, X.F.; Tan, M.; Zou, L.; Li, X.L.; Chen, Y. A benchmark data set for aircraft type recognition from remote sensing images. Applied Soft Computing 2020, 89, 106132.
- Yu, W.; Cheng, G.; Wang, M.; Yao, Y.; Xie, X.; Yao, X.; Han, J. MAR20: Remote sensing image military aircraft target recognition dataset. Journal of Remote Sensing 2023, 27, 2688–2696.
- Sun, X.; Wang, P.; Yan, Z.; Xu, F.; Wang, R.; Diao, W.; Chen, J.; Li, J.; Feng, Y.; Xu, T.; et al. FAIR1M: A benchmark dataset for fine-grained object recognition in high-resolution remote sensing imagery. ISPRS Journal of Photogrammetry and Remote Sensing 2022, 184, 116–130.
- Xiang, X.; Xu, Z.; Deng, Y.; Zhou, Q.; Liang, Y.; Chen, K.; Zheng, Q.; Wang, Y.; Chen, X.; Gao, W. OpenEarthSensing: Large-scale fine-grained benchmark for open-world remote sensing. arXiv preprint arXiv:2502.20668 2025.
- Xiao, Z.; Long, Y.; Li, D.; Wei, C.; Tang, G.; Liu, J. High-resolution remote sensing image retrieval based on CNNs from a dimensional perspective. Remote Sensing 2017, 9, 725.
- Wang, Q.; Liu, S.; Chanussot, J.; Li, X. Scene classification with recurrent attention of VHR remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 2018, 57, 1155–1167.
- Li, H.; Dou, X.; Tao, C.; Wu, Z.; Chen, J.; Peng, J.; Deng, M.; Zhao, L. RSI-CB: A large-scale remote sensing image classification benchmark using crowdsourced data. Sensors 2020, 20, 1594.
- Li, Y.; Kong, D.; Zhang, Y.; Tan, Y.; Chen, L. Robust deep alignment network with remote sensing knowledge graph for zero-shot and generalized zero-shot remote sensing image scene classification. ISPRS Journal of Photogrammetry and Remote Sensing 2021, 179, 145–158.
- Hua, Y.; Mou, L.; Jin, P.; Zhu, X.X. MultiScene: A large-scale dataset and benchmark for multiscene recognition in single aerial images. IEEE Transactions on Geoscience and Remote Sensing 2021, 60, 1–13.
- Yuan, J.; Ru, L.; Wang, S.; Wu, C. WH-MAVS: A novel dataset and deep learning benchmark for multiple land use and land cover applications. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2022, 15, 1575–1590.
- Zhao, D.; Yuan, B.; Chen, Z.; Li, T.; Liu, Z.; Li, W.; Gao, Y. Panoptic Perception: A Novel Task and Fine-Grained Dataset for Universal Remote Sensing Image Interpretation 2024. 62, 1–14. [CrossRef]
- Guo, Z.; Zhang, M.; Jia, W.; Zhang, J.; Li, W. Dual-concentrated network with morphological features for tree species classification using hyperspectral image. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2022, 15, 7013–7024.
- Peng, Y.; Zhang, Y.; Tu, B.; Li, Q.; Li, W. Spatial–Spectral Transformer With Cross-Attention for Hyperspectral Image Classification 2022. 60, 1–15. [CrossRef]
- Han, Z.; Xu, S.; Gao, L.; Li, Z.; Zhang, B. GRetNet: Gaussian Retentive Network for Hyperspectral Image Classification 2024. 21, 1–5. [CrossRef]
- Jia, C.; Zhang, X.; Meng, H.; Xia, S.; Jiao, L. CenterFormer: A Center Spatial–Spectral Attention Transformer Network for Hyperspectral Image Classification 2025. 18, 5523–5539. [CrossRef]
- Zhao, Y.; Bao, W.; Xu, X.; Zhou, Y. E2TNet: Efficient enhancement Transformer network for hyperspectral image classification 2024. 142, 105569. [CrossRef]
- Li, Z.; Guo, F.; Li, Q.; Ren, G.; Wang, L. An Encoder–Decoder Convolution Network With Fine-Grained Spatial Information for Hyperspectral Images Classification 2020. 8, 33600–33608. [CrossRef]
- Roy, S.K.; Kar, P.; Hong, D.; Wu, X.; Plaza, A.; Chanussot, J. Revisiting Deep Hyperspectral Feature Extraction Networks via Gradient Centralized Convolution 2022. 60, 1–19. [CrossRef]
- Zhang, M.; Li, W.; Zhao, X.; Liu, H.; Tao, R.; Du, Q. Morphological Transformation and Spatial-Logical Aggregation for Tree Species Classification Using Hyperspectral Imagery 2023. 61, 1–12. [CrossRef]
- Guo, Z.; Zhang, M.; Jia, W.; Zhang, J.; Li, W. Dual-Concentrated Network With Morphological Features for Tree Species Classification Using Hyperspectral Image 2022. 15, 7013–7024. [CrossRef]
- Roy, S.K.; Deria, A.; Shah, C.; Haut, J.M.; Du, Q.; Plaza, A. Spectral–Spatial Morphological Attention Transformer for Hyperspectral Image Classification 2023. 61, 1–15. [CrossRef]
- Ji, R.; Tan, K.; Wang, X.; Tang, S.; Sun, J.; Niu, C.; Pan, C. PatchOut: A novel patch-free approach based on a transformer-CNN hybrid framework for fine-grained land-cover classification on large-scale airborne hyperspectral images 2025. 138, 104457. [CrossRef]
- Yuan, J.; Wang, S.; Wu, C.; Xu, Y. Fine-Grained Classification of Urban Functional Zones and Landscape Pattern Analysis Using Hyperspectral Satellite Imagery: A Case Study of Wuhan 2022. 15, 3972–3991. [CrossRef]
- Chen, Z.; Xu, T.; Pan, Y.; Shen, N.; Chen, H.; Li, J. Edge Feature Enhancement for Fine-Grained Segmentation of Remote Sensing Images 2024. 62, 1–13. [CrossRef]
- Chen, Y.; Huang, L.; Zhu, L.; Yokoya, N.; Jia, X. Fine-Grained Classification of Hyperspectral Imagery Based on Deep Learning 2019. 11, 2690. [CrossRef]
- Miao, J.; Zhang, B.; Wang, B. Coarse-to-Fine Joint Distribution Alignment for Cross-Domain Hyperspectral Image Classification 2021. 14, 12415–12428. [CrossRef]
- Wu, H.; Xue, Z.; Zhou, S.; Su, H. Overcoming Granularity Mismatch in Knowledge Distillation for Few-Shot Hyperspectral Image Classification 2025. 63, 1–17. [CrossRef]
- Huang, Y.; Peng, J.; Zhang, G.; Sun, W.; Chen, N.; Du, Q. Adversarial Domain Adaptation Network With Calibrated Prototype and Dynamic Instance Convolution for Hyperspectral Image Classification 2024. 62, 1–13. [CrossRef]
- Ma, Y.; Deng, X.; Wei, J. Land Use Classification of High-Resolution Multispectral Satellite Images With Fine-Grained Multiscale Networks and Superpixel Postprocessing 2023. 16, 3264–3278. [CrossRef]
- Zhao, C.; Chen, M.; Feng, S.; Qin, B.; Zhang, L. A Coarse-to-Fine Semisupervised Learning Method Based on Superpixel Graph and Breaking-Tie Sampling for Hyperspectral Image Classification 2023. 20, 1–5. [CrossRef]
- Ni, K.; Xie, Y.; Zhao, G.; Zheng, Z.; Wang, P.; Lu, T. Coarse-to-Fine High-Order Network for Hyperspectral and LiDAR Classification 2025. 63, 1–16. [CrossRef]
- Liu, Y.; Ye, Z.; Xi, Y.; Liu, H.; Li, W.; Bai, L. Multiscale and Multidirection Feature Extraction Network for Hyperspectral and LiDAR Classification 2024. 17, 9961–9973. [CrossRef]
- Liu, Z.; Li, J.; Wang, L.; Plaza, A. Integration of Remote Sensing and Crowdsourced Data for Fine-Grained Urban Flood Detection 2024. 17, 13523–13532. [CrossRef]
- Bai, J.; Yuan, A.; Xiao, Z.; Zhou, H.; Wang, D.; Jiang, H.; Jiao, L. Class Incremental Learning With Few-Shots Based on Linear Programming for Hyperspectral Image Classification 2022. 52, 5474–5485. [CrossRef]
- Ouyang, L.; Guo, G.; Fang, L.; Ghamisi, P.; Yue, J. PCLDet: Prototypical Contrastive Learning for Fine-Grained Object Detection in Remote Sensing Images 2023. 61, 1–11. [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 580–587.
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, 2015, pp. 91–99.
- He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2961–2969.
- Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6154–6162.
- Han, Y.; Yang, X.; Pu, T.; Peng, Z. Fine-Grained Recognition for Oriented Ship Against Complex Scenes in Optical Remote Sensing Images 2022. 60, 1–18. [CrossRef]
- Sumbul, G.; Cinbis, R.G.; Aksoy, S. Multisource Region Attention Network for Fine-Grained Object Recognition in Remote Sensing Imagery 2019. 57, 4929–4937, [arXiv:1901.06403]. [CrossRef]
- Guo, B.; Zhang, R.; Guo, H.; Yang, W.; Yu, H.; Zhang, P.; Zou, T. Fine-Grained Ship Detection in High-Resolution Satellite Images With Shape-Aware Feature Learning 2023. 16, 1914–1926. [CrossRef]
- Cheng, J.; Yao, X.; Yang, X.; Yuan, X.; Feng, X.; Cheng, G.; Huang, X.; Han, J. DIMA: Digging Into Multigranular Archetype for Fine-Grained Object Detection 2024. 62, 1–14. [CrossRef]
- Wang, L.; Zhang, J.; Tian, J.; Li, J.; Zhuo, L.; Tian, Q. Efficient Fine-Grained Object Recognition in High-Resolution Remote Sensing Images From Knowledge Distillation to Filter Grafting 2023. 61, 1–16. [CrossRef]
- Zeng, L.; Guo, H.; Yang, W.; Yu, H.; Yu, L.; Zhang, P.; Zou, T. Instance Switching-Based Contrastive Learning for Fine-Grained Airplane Detection 2022. 60, 1–16. [CrossRef]
- Li, W.; Zhao, D.; Yuan, B.; Gao, Y.; Shi, Z. PETDet: Proposal Enhancement for Two-Stage Fine-Grained Object Detection 2024. 62, 1–14. [CrossRef]
- Cheng, G.; Li, Q.; Wang, G.; Xie, X.; Min, L.; Han, J. SFRNet: Fine-Grained Oriented Object Recognition via Separate Feature Refinement 2023. 61, 1–10. [CrossRef]
- Ouyang, L.; Fang, L.; Ji, X. Multigranularity Self-Attention Network for Fine-Grained Ship Detection in Remote Sensing Images 2022. 15, 9722–9732. [CrossRef]
- Liu, Y.; Liu, J.; Li, X.; Wei, L.; Wu, Z.; Han, B.; Dai, W. Exploiting Discriminating Features for Fine-Grained Ship Detection in Optical Remote Sensing Images 2024. 17, 20098–20115. [CrossRef]
- Yang, Y.; Zhang, Z.; Feng, P.; Yan, Y.; He, G.; Liu, S.; Zhang, P.; Gao, H. HMS-Net: A Hierarchical Multilabel Fine-Grained Ship Detection Network in Remote Sensing Images 2025. 18, 15394–15411. [CrossRef]
- Zhu, Z.; Sun, X.; Diao, W.; Chen, K.; Xu, G.; Fu, K. Invariant Structure Representation for Remote Sensing Object Detection Based on Graph Modeling 2022. 60, 1–17. [CrossRef]
- Li, Y.; Chen, L.; Li, W. Fine-Grained Ship Recognition With Spatial-Aligned Feature Pyramid Network and Adaptive Prototypical Contrastive Learning 2025. 63, 1–13. [CrossRef]
- Gong, T.; Cheng, W.; Chen, Y.; Xiong, S.; Lu, X. Discover the Unknown Ones in Fine-Grained Ship Detection 2025. 63, 1–14. [CrossRef]
- Chen, X.; Chen, X.; Ge, X.; Chen, J.; Wang, H. Online Decoupled Distillation Based on Prototype Contrastive Learning for Lightweight Underwater Object Detection Models 2025. 63, 1–14. [CrossRef]
- Guo, H.; Liu, Y.; Pan, Z.; Hu, Y. Advancing Fine-Grained Few-Shot Object Detection on Remote Sensing Images with Decoupled Self-Distillation and Progressive Prototype Calibration 2025. 17, 495. [CrossRef]
- Lu, X.; Sun, X.; Diao, W.; Mao, Y.; Li, J.; Zhang, Y.; Wang, P.; Fu, K. Few-Shot Object Detection in Aerial Imagery Guided by Text-Modal Knowledge 2023. 61, 1–19. [CrossRef]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision. Springer, 2016, pp. 21–37.
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 2980–2988.
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767 2018.
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10781–10790.
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv preprint arXiv:2207.02696 2022.
- Zhang, Y.; Li, S.; Wang, H.; Liu, Y.; Zhang, J. Aircraft Target Detection in Remote Sensing Images Based on Improved YOLOv7-Tiny Network. IEEE Geoscience and Remote Sensing Letters 2024, 21, 1–5. [CrossRef]
- Chen, Y.; Liu, J.; Zhang, Y.; Li, W.; Wang, H. A Remote Sensing Target Detection Model Based on Lightweight Feature Enhancement and Feature Refinement Extraction. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2024, 17, 5265–5279. [CrossRef]
- Luo, Y.; Xiong, G.; Li, X.; Wang, Z.; Chen, J. An Improved YOLOv8 Detector for Multi-Scale Target Detection in Remote Sensing Images. IEEE Transactions on Geoscience and Remote Sensing 2024, 62, 1–16. [CrossRef]
- Li, M.; Zhang, W.; Wang, Q.; Zhao, Y. YOLO-RS: Remote Sensing Enhanced Crop Detection Methods, 2025, [arXiv:cs.CV/2504.11165].
- Wang, C.; Li, J.; Zhang, H.; Liu, X. YOLOX-DW: A Fine-Grained Object Detection Algorithm for Remote Sensing Images. Remote Sensing 2024. [CrossRef]
- Xi, Y.; Jia, W.; Miao, Q.; Feng, J.; Ren, J.; Luo, H. Detection-Driven Exposure-Correction Network for Nighttime Drone-View Object Detection 2024. 62, 1–14. [CrossRef]
- Yang, J.; Fu, K.; Wu, Y.; Diao, W.; Dai, W.; Sun, X. Mutual-Feed Learning for Super-Resolution and Object Detection in Degraded Aerial Imagery 2022. 60, 1–16. [CrossRef]
- Wu, J.; Zhao, F.; Yao, G.; Jin, Z. FGA-YOLO: A one-stage and high-precision detector designed for fine-grained aircraft recognition 2025. 618, 129067. [CrossRef]
- Zhao, S.; Chen, H.; Zhang, D.; Tao, Y.; Feng, X.; Zhang, D. SR-YOLO: Spatial-to-Depth Enhanced Multi-Scale Attention Network for Small Target Detection in UAV Aerial Imagery 2025. 17, 2441. [CrossRef]
- Wu, F.; Hu, T.; Xia, Y.; Ma, B.; Sarwar, S.; Zhang, C. WDFA-YOLOX: A Wavelet-Driven and Feature-Enhanced Attention YOLOX Network for Ship Detection in SAR Images 2024. 16, 1760. [CrossRef]
- Song, Y.; Wang, S.; Li, Q.; Mu, H.; Feng, R.; Tian, T.; Tian, J. Vehicle Target Detection Method for Wide-Area SAR Images Based on Coarse-Grained Judgment and Fine-Grained Detection 2023. 15, 3242. [CrossRef]
- Zhang, J.; Zhang, Y.; Shi, Z.; Zhang, Y.; Gao, R. Unmanned Aerial Vehicle Object Detection Based on Information-Preserving and Fine-Grained Feature Aggregation 2024. 16, 2590. [CrossRef]
- Xi, Y.; Jia, W.; Miao, Q.; Liu, X.; Fan, X.; Li, H. FiFoNet: Fine-Grained Target Focusing Network for Object Detection in UAV Images 2022. 14, 3919. [CrossRef]
- Ma, S.; Wang, W.; Pan, Z.; Hu, Y.; Zhou, G.; Wang, Q. A Recognition Model Incorporating Geometric Relationships of Ship Components 2024. 16, 130. [CrossRef]
- Jiang, X.N.; Niu, X.Q.; Wu, F.L.; Fu, Y.; Bao, H.; Fan, Y.C.; Zhang, Y.; Pei, J.Y. A Fine-Grained Aircraft Target Recognition Algorithm for Remote Sensing Images Based on YOLOV8 2025. 18, 4060–4073. [CrossRef]
- Huang, Q.; Yao, R.; Lu, X.; Zhu, J.; Xiong, S.; Chen, Y. Oriented Object Detector With Gaussian Distribution Cost Label Assignment and Task-Decoupled Head 2024. 62, 1–16. [CrossRef]
- Liu, S.; Yang, Z.; Li, Q.; Wang, Q. InterMamba: A Visual-Prompted Interactive Framework for Dense Object Detection and Annotation 2025. 63, 1–11. [CrossRef]
- Su, Y.; Zhang, T.; Li, F. SA-YOLO: Self-Adaptive Loss Function for Imbalanced Sample Detection. Journal of Electronics and Information Technology 2024, 46, 123–134.
- Yang, B.; Han, J.; Hou, X.; Zhou, D.; Liu, W.; Bi, F. FSDA-DETR: Few-Shot Domain-Adaptive Object Detection Transformer in Remote Sensing Imagery 2025. 63, 1–16. [CrossRef]
- Wang, B.; Sui, H.; Ma, G.; Zhou, Y.; Zhou, M. GMODet: A Real-Time Detector for Ground-Moving Objects in Optical Remote Sensing Images With Regional Awareness and Semantic–Spatial Progressive Interaction 2025. 63, 1–23. [CrossRef]
- Xu, X.; Chen, Z.; Zhang, X.; Wang, G. Context-Aware Content Interaction: Grasp Subtle Clues for Fine-Grained Aircraft Detection 2024. 62, 1–19. [CrossRef]
- Sumbul, G.; Cinbis, R.G.; Aksoy, S. Fine-Grained Object Recognition and Zero-Shot Learning in Remote Sensing Imagery 2018. 56, 770–779. [CrossRef]
- Zhang, J.; Zhong, Z.; Wei, X.; Wu, X.; Li, Y. Remote Sensing Image Harmonization Method for Fine-Grained Ship Classification 2024. 16, 2192. [CrossRef]
- Yi, Y.; You, Y.; Li, C.; Zhou, W. EFM-Net: An Essential Feature Mining Network for Target Fine-Grained Classification in Optical Remote Sensing Images 2023. 61, 1–16. [CrossRef]
- Zhao, W.; Tong, T.; Yao, L.; Liu, Y.; Xu, C.; He, Y.; Lu, H. Feature Balance for Fine-Grained Object Classification in Aerial Images 2022. 60, 1–13. [CrossRef]
- Chen, D.; Tu, W.; Cao, R.; Zhang, Y.; He, B.; Wang, C.; Shi, T.; Li, Q. A hierarchical approach for fine-grained urban villages recognition fusing remote and social sensing data 2022. 106, 102661. [CrossRef]
- Wu, H.; Nie, J.; He, Z.; Zhu, Z.; Gao, M. One-Shot Multiple Object Tracking in UAV Videos Using Task-Specific Fine-Grained Features 2022. 14, 3853. [CrossRef]
- Jiang, C.; Ren, H.; Li, F.; Hong, Z.; Huo, H.; Zhang, J.; Xin, J. Object detection from aerial multi-angle thermal infrared remote sensing images: Dataset and method 2025. 228, 438–452. [CrossRef]
- Luo, R.; He, Q.; Zhao, L.; Zhang, S.; Kuang, G.; Ji, K. Geospatial Contextual Prior-Enabled Knowledge Reasoning Framework for Fine-Grained Aircraft Detection in Panoramic SAR Imagery 2024. 62, 1–13. [CrossRef]
- Chen, Y.; Huang, J.; Sun, Z.; Xiong, S.; Lu, X. Thread the Needle: Cues-Driven Multiassociation for Remote Sensing Cross-Modal Retrieval 2024. 62, 1–13. [CrossRef]
- Zhao, Q.; Lyu, S.; Li, Y.; Ma, Y.; Chen, L. MGML: Multigranularity Multilevel Feature Ensemble Network for Remote Sensing Scene Classification 2023. 34, 2308–2322. [CrossRef]
- Guo, W.; Li, S.; Yang, J.; Zhou, Z.; Liu, Y.; Lu, J.; Kou, L.; Zhao, M. Remote Sensing Image Scene Classification by Multiple Granularity Semantic Learning 2022. 15, 2546–2562. [CrossRef]
- Wang, S.; Guan, Y.; Shao, L. Multi-Granularity Canonical Appearance Pooling for Remote Sensing Scene Classification 2020. 29, 5396–5407. [CrossRef]
- Bai, L.; Liu, Q.; Li, C.; Ye, Z.; Hui, M.; Jia, X. Remote Sensing Image Scene Classification Using Multiscale Feature Fusion Covariance Network With Octave Convolution 2022. 60, 1–14. [CrossRef]
- Miao, W.; Geng, J.; Jiang, W. Multigranularity Decoupling Network With Pseudolabel Selection for Remote Sensing Image Scene Classification 2023. 61, 1–13. [CrossRef]
- Ye, Z.; Zhang, Y.; Zhang, J.; Li, W.; Bai, L. A Multiscale Incremental Learning Network for Remote Sensing Scene Classification 2024. 62, 1–15. [CrossRef]
- Niu, B.; Pan, Z.; Chen, K.; Hu, Y.; Lei, B. Open Set Domain Adaptation via Instance Affinity Metric and Fine-Grained Alignment for Remote Sensing Scene Classification 2023. 20, 1–5. [CrossRef]
- Li, Y.; Li, Z.; Su, A.; Wang, K.; Wang, Z.; Yu, Q. Semisupervised Cross-Domain Remote Sensing Scene Classification via Category-Level Feature Alignment Network 2024. 62, 1–14. [CrossRef]
- Wang, Y.; Shu, Z.; Feng, Y.; Liu, R.; Cao, Q.; Li, D.; Wang, L. Enhancing Cross-Domain Remote Sensing Scene Classification by Multi-Source Subdomain Distribution Alignment Network 2025. 17, 1302. [CrossRef]
- Zhu, P.; Zhang, X.; Han, X.; Cheng, X.; Gu, J.; Chen, P.; Jiao, L. Cross-Domain Classification Based on Frequency Component Adaptation for Remote Sensing Images 2024. 16, 2134. [CrossRef]
- Xiao, R.; Wang, Y.; Tao, C. Fine-Grained Road Scene Understanding From Aerial Images Based on Semisupervised Semantic Segmentation Networks 2022. 19, 1–5. [CrossRef]
- Li, Y.; Kong, D.; Zhang, Y.; Tan, Y.; Chen, L. Robust deep alignment network with remote sensing knowledge graph for zero-shot and generalized zero-shot remote sensing image scene classification 2021. 179, 145–158. [CrossRef]
- Li, Z.; Xu, W.; Yang, S.; Wang, J.; Su, H.; Huang, Z.; Wu, S. A Hierarchical Graph-Enhanced Transformer Network for Remote Sensing Scene Classification 2024. 17, 20315–20330. [CrossRef]
- Xu, K.; Deng, P.; Huang, H. Vision Transformer: An Excellent Teacher for Guiding Small Networks in Remote Sensing Image Scene Classification 2022. 60, 1–15. [CrossRef]
- Shi, C.; Ding, M.; Wang, L.; Pan, H. Learn by Yourself: A Feature-Augmented Self-Distillation Convolutional Neural Network for Remote Sensing Scene Image Classification 2023. 15, 5620. [CrossRef]
- Wang, G.; Chen, H.; Chen, L.; Zhuang, Y.; Zhang, S.; Zhang, T.; Dong, H.; Gao, P. P2FEViT: Plug-and-Play CNN Feature Embedded Hybrid Vision Transformer for Remote Sensing Image Classification 2023. 15, 1773. [CrossRef]
- Solomon, A.A.; Agnes, S.A. MSCAC: A Multi-Scale Swin–CNN Framework for Progressive Remote Sensing Scene Classification 2024. 4, 462–480. [CrossRef]
- Cheng, G.; Xie, X.; Han, J.; Guo, L.; Xia, G.S. Remote Sensing Image Scene Classification Meets Deep Learning: Challenges, Methods, Benchmarks, and Opportunities 2020. 13, 3735–3756. [CrossRef]
- Thapa, A.; Horanont, T.; Neupane, B.; Aryal, J. Deep Learning for Remote Sensing Image Scene Classification: A Review and Meta-Analysis 2023. 15, 4804. [CrossRef]
- Yu, D.; Xu, Q.; Guo, H.; Zhao, C.; Lin, Y.; Li, D. An Efficient and Lightweight Convolutional Neural Network for Remote Sensing Image Scene Classification 2020. 20, 1999. [CrossRef]
- Yu, D.; Xu, Q.; Guo, H.; Zhao, C.; Lin, Y.; Li, D. An Efficient and Lightweight Convolutional Neural Network for Remote Sensing Image Scene Classification 2020. 20, 1999. [CrossRef]
- Xiong, W.; Xiong, Z.; Cui, Y. A Confounder-Free Fusion Network for Aerial Image Scene Feature Representation 2022. 15, 5440–5454. [CrossRef]
- Yuan, Z.; Zhang, W.; Tian, C.; Rong, X.; Zhang, Z.; Wang, H.; Fu, K.; Sun, X. Remote Sensing Cross-Modal Text-Image Retrieval Based on Global and Local Information 2022. 60, 1–16. [CrossRef]
- Hu, G.; Wen, Z.; Lv, Y.; Zhang, J.; Wu, Q. Global–Local Information Soft-Alignment for Cross-Modal Remote-Sensing Image–Text Retrieval 2024. 62, 1–15. [CrossRef]
- Zheng, F.; Wang, X.; Wang, L.; Zhang, X.; Zhu, H.; Wang, L.; Zhang, H. A Fine-Grained Semantic Alignment Method Specific to Aggregate Multi-Scale Information for Cross-Modal Remote Sensing Image Retrieval 2023. 23, 8437. [CrossRef]
- Chen, Y.; Huang, J.; Li, X.; Xiong, S.; Lu, X. Multiscale Salient Alignment Learning for Remote-Sensing Image–Text Retrieval 2024. 62, 1–13. [CrossRef]
- Yuan, Z.; Zhang, W.; Fu, K.; Li, X.; Deng, C.; Wang, H.; Sun, X. Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval 2022. 60, 1–19, [arXiv:2204.09868]. [CrossRef]
- Cheng, Q.; Zhou, Y.; Huang, H.; Wang, Z. Multi-Attention Fusion and Fine-Grained Alignment for Bidirectional Image-Sentence Retrieval in Remote Sensing 2022. 9, 1532–1535. [CrossRef]
- Yang, L.; Feng, Y.; Zhou, M.; Xiong, X.; Wang, Y.; Qiang, B. A Jointly Guided Deep Network for Fine-Grained Cross-Modal Remote Sensing Text–Image Retrieval 2023. 32, 2350221. [CrossRef]
- Cheng, Q.; Zhou, Y.; Fu, P.; Xu, Y.; Zhang, L. A Deep Semantic Alignment Network for the Cross-Modal Image-Text Retrieval in Remote Sensing 2021. 14, 4284–4297. [CrossRef]
- Xiu, D.; Ji, L.; Geng, X.; Wu, Y. RSITR-FFT: Efficient Fine-Grained Fine-Tuning Framework With Consistency Regularization for Remote Sensing Image-Text Retrieval 2024. 21, 1–5. [CrossRef]
- Zhou, Z.; Feng, Y.; Qiu, A.; Duan, G.; Zhou, M. Fine-Grained Information Supplementation and Value-Guided Learning for Remote Sensing Image-Text Retrieval 2024. 17, 19194–19210. [CrossRef]
- Sun, T.; Zheng, C.; Li, X.; Gao, Y.; Nie, J.; Huang, L.; Wei, Z. Strong and Weak Prompt Engineering for Remote Sensing Image-Text Cross-Modal Retrieval 2025. 18, 6968–6980. [CrossRef]
- Ning, H.; Wang, S.; Lei, T.; Cao, X.; Dou, H.; Zhao, B.; Nandi, A.K.; Radeva, P. Representation discrepancy bridging method for remote sensing image-text retrieval 2025. 650, 130915. [CrossRef]
- Chen, Y.; Huang, J.; Xiong, S.; Lu, X. Integrating Multisubspace Joint Learning With Multilevel Guidance for Cross-Modal Retrieval of Remote Sensing Images 2024. 62, 1–17. [CrossRef]
- Mao, Y.Q.; Jiang, Z.; Liu, Y.; Zhang, Y.; Qi, K.; Bi, H.; He, Y. FRORS: An Effective Fine-Grained Retrieval Framework for Optical Remote Sensing Images 2025. 18, 7406–7419. [CrossRef]
- Huang, J.; Feng, Y.; Zhou, M.; Xiong, X.; Wang, Y.; Qiang, B. Deep Multiscale Fine-Grained Hashing for Remote Sensing Cross-Modal Retrieval 2024. 21, 1–5. [CrossRef]
- Yu, H.; Yao, F.; Lu, W.; Liu, N.; Li, P.; You, H.; Sun, X. Text-Image Matching for Cross-Modal Remote Sensing Image Retrieval via Graph Neural Network 2023. 16, 812–824. [CrossRef]
- Pan, J.; Ma, Q.; Bai, C. Reducing Semantic Confusion: Scene-aware Aggregation Network for Remote Sensing Cross-modal Retrieval. In Proceedings of the 2023 ACM International Conference on Multimedia Retrieval (ICMR ’23). Association for Computing Machinery, 2023, pp. 398–406. [CrossRef]
- Yang, B.; Wang, C.; Ma, X.; Song, B.; Liu, Z.; Sun, F. Zero-Shot Sketch-Based Remote-Sensing Image Retrieval Based on Multi-Level and Attention-Guided Tokenization. Remote Sensing 2024, 16, 1653.
- Liu, Y.; Dang, Y.; Qi, H.; Han, J.; Shao, L. Zero-shot sketch-based remote sensing image retrieval based on cross-modal fusion. Neural Networks 2025, p. 107796.







| Dataset Name | Resolution (m) | Content | Categories | Total Images | Source |
|---|---|---|---|---|---|
| FGSCR-42 [14] | 0.1-4.5 | Ship | 42 | 9320 | GoogleEarth, ISPRS, GaoFen etc. |
| FGSD [23] | 0.3-2 | Ship | 43 | 4736 | GoogleEarth |
| ShipRSImageNet [15] | 0.12-6 | Ship | 50 | 3435 | WorldView-3, GaoFen-2 etc. |
| MFBFS [16] | 1-4 | Building | 3 | 11005 | GaoFen-2 |
| UBC [24] | 0.5-0.8 | Building | 61 | 800 | SuperView, GaoFen-2 |
| UBC-v2 [25] | 0.5-1 | Building | 12 | 11336 | SuperView, GaoFen-2, GaoFen-3 |
| DFC2023 [26] | 0.5-1 | Building | 12 | 300k | SuperView, GaoFen-2, GaoFen-3 |
| MTARSI [27] | 0.3-2 | Aircraft | 20 | 9598 | GoogleEarth |
| MAR20 [28] | 0.3-2 | Aircraft | 20 | 3842 | GoogleEarth |
| FAIR1M [29] | 0.3-0.8 | Airplanes, Ships, Vehicles, Courts, Roads | 37 | 15000 | GaoFen, GoogleEarth |
| OpenEarthSensing [30] | 0.3-10 | Objects and Scenes | 189 | 157674 | Different Public Datasets |
| TREE [10] | 0.68 | Tree Species | 12 | - | LiCHy Hyperspectral system |
| Belgium Data [11] | 0.68 | Tree Species | 7 | 1450 | LiCHy Hyperspectral system |
| FUSU [12] | 0.2-0.5 | Land Use Change | 17 | 62752 | Google Earth, Sentinel |
| MEET [22] | 2025 | Scene | 80 | 1033778 | OpenStreetMap |
| NWPU [18] | 0.2-30 | Scene | 45 | 31500 | GoogleEarth |
| AID [17] | 0.5-8 | Scene | 30 | 10000 | GoogleEarth |
| RSD46-WHU [31] | 0.5-2 | Scene | 46 | 117000 | GoogleEarth |
| MLRSN [20] | 0.1-10 | Scene | 46 | 109161 | GoogleEarth |
| Million-AID* [21] | 0.5-153 | Scene | 51 | 10000 | GoogleEarth |
| PatternNet [19] | 0.06-4.7 | Scene | 38 | 30400 | Different Public Datasets |
| OPTIMAL-31 [32] | - | Scene | 31 | 1860 | GoogleEarth, Bing maps |
| RSI-CB256 [33] | 0.3-3 | Scene | 35 | 24000 | GoogleEarth, Bing maps |
| RSI-CB128 [33] | 0.3-3 | Scene | 45 | 36000 | GoogleEarth, Bing maps |
| SR-RSKG [34] | 0.2-30 | Scene | 70 | 56000 | GoogleEarth etc. |
| Multiscene [35] | 0.3-0.6 | Scene | 36 | 100000 | GoogleEarth, OpenStreetMap |
| WH-MAVS [36] | 1.2 | Scene | 14 | 47137 | GoogleEarth |

| Method | Backbone (+FPN) | RPN | RoIAlign | Bbox Cls/Reg | Mask Branch | Purpose | Reference |
|---|---|---|---|---|---|---|---|
| EIRNet | × | × | Bidirectional feature fusion via DFF-Net; Optimize proposals with Mask-RPN (reuse attention mask); Mine interclass relations for ship fine-grained Cls | [67] | | | |
| MRAN | Fuse RGB/multispectral/LiDAR features; Proposal generation via attention scores; Optimize RoI sampling for small trees; Multisource feature-driven Cls/Reg; Refine tree canopy segmentation | [68] | | | | | |
| PCLDet | × | Prototype learning for fine-grained features; Class-balanced sampler (CBS) for long-tail data; ProtoCL loss for Bbox Cls/Reg; Prototype constraint for Mask segmentation | [61] | | | | |
| SAM (Shape-Aware Model) | Shape-aware Conv for large-aspect-ratio ships; Dynamic anchor adjustment; RoI sampling optimization for deformed ship parts; Shape loss for Bbox fitting; Shape-constrained Mask for ship-background distinction | [69] | | | | | |
| HCP-Mask-RCNN | Frequency-aware (FARS) module for detail features; Fine-grained proposal prioritization; RoI alignment with frequency features; Coarse-fine hierarchy (HCP) for Cls; Frequency-guided Mask for fine structures | [70] | | | | | |
| Oriented R-CNN | × | × | Oriented feature enhancement via FPN; Oriented proposal generation; Geospatial object localization/Cls; Serve as teacher network for knowledge distillation | [71] | | | |
| ISCL-Mask-RCNN | × | × | × | Contrastive learning (CLM) to widen interclass distance; Refined instance switching (ReIS) for class imbalance; Improve airplane fine-grained detection (HBB/OBB) | [72] | | |
| PETDet | × | Anchor-free QOPN for high-quality proposals; Bilinear channel fusion (BCFN) for RoI features; Adaptive recognition loss (ARL) for Cls/Reg; Focus on fine-grained target distinction | [73] | | | | |
| SFRNet | SC-Former for spatial-channel interaction; OR-Former for rotation-sensitive features; Multi-RoI loss (MRL) for Cls; Separate feature refinement for Cls/segmentation | [74] | | | | | |
| MGANet | × | Local-global alignment (LAM) for ship features; Multigranularity self-attention (MSM) for fusion; RoIAlign optimization for local ship parts; Improve dense ship fine-grained Cls/Reg | [75] | | | | |
| FineShipNet | × | Blend synchronization module for feature reuse; Polarized feature focusing for task decoupling; Adaptive harmony anchor labeling; RoIAlign for ship discriminative features (Cls/Reg) | [76] | | | | |
| HMS-Net | × | Multiscale region feature re-extraction; Top-down feature fusion with guidance; Hierarchical loss for interclass relations (Cls); RoIAlign for ship fine-grained features | [77] | | | | |
| GFA-Net | Graph focusing process (GFP) for structural features; Graph aggregation network (GAN) for node weight; RoIAlign for invariant structure features (Cls/Reg); Mask segmentation for object structure preservation | [78] | | | | | |
| DIMA | FARS module for frequency-domain features; Hierarchical classification (HCP) for Cls; RoI alignment with frequency details; Mask refinement for fine target structures | [70] | | | | | |

| Method | Input Stage | Backbone | Neck | Head | Purpose | Reference |
|---|---|---|---|---|---|---|
| FGA-YOLO | × | Aggregate multi-layer features to enhance multi-scale information; Extract key discriminative features to improve fine-grained recognition; Alleviate imbalance between easy/hard samples via EMA Slide Loss | [97] | | | |
| SR-YOLO | × | Extract small-target fine-grained features via SR-Conv module; Enhance small-target feature fusion with bidirectional FPN; Improve detection accuracy via Normalized Wasserstein Distance Loss | [98] | | | |
| IF-YOLO | × | × | Preserve small-target intrinsic features via IPFA module; Suppress conflicting information with CSFM; Fuse multi-scale features via FGAFPN | [101] | | |
| WDFA-YOLOX | × | Compensate SAR fine-grained feature loss via WSPP module; Enhance small-ship features with GLFAE; Improve bounding-box regression via Chebyshev distance-GIoU Loss | [99] | | | |
| Related-YOLO | × | × | Model ship component geometric relationships via relational attention; Adapt to rotated ships with deformable convolution; Optimize anchors via hierarchical clustering | [103] | | |
| YOLOv5+CAM | × | Capture key regions via CAM attention module; Fuse multi-scale features with CAM-FPN; Enhance training via coarse-grained judgment + background supervision | [100] | | | |
| FiFoNet | × | × | Capture global-local context via GLCC module; Select valid multi-scale features to block redundant information; Improve small-target detection in UAV images | [102] | | |
| FD-YOLOv8 | × | × | Preserve aircraft local details via local feature module; Enhance local-global interaction via focus modulation; Improve fine-grained accuracy in complex backgrounds | [104] | | |
| YOLOX (GTDet) | × | Adapt to oriented targets via GCOTA label assignment; Improve angle prediction via DLAAH; Enhance localization via anchor-free detection | [105] | | | |
| DEDet | × | × | Restore nighttime details via FPP module; Filter background interference via progressive filtering; Improve nighttime UAV target detection | [95] | | |
| MFL | × | × | Realize SR-OD mutual feedback via MFL closed-loop; Focus on ROI details via FROI module; Narrow target feature differences via MSOI | [96] | | |
| InterMamba | × | Capture long-range dependencies via VMamba backbone; Fuse multi-scale features via cross-VSSM; Optimize dense detection via UIL loss | [106] | | | |
| Improved YOLOv7-Tiny | × | × | Construct diverse remote sensing aircraft dataset; Apply multi-scale/rotation augmentation to enrich input samples | [90] | | |
| Lightweight FE-YOLO | × | Preprocess input data to highlight small-target fine-grained features; Reduce input noise interference via similarity-based channel screening; Optimize input feature distribution for remote sensing scenarios | [91] | | | |
| YOLOv8 (G-HG) | × | Adjust input feature resolution to match multi-scale remote sensing targets; Retain fine-grained details in input via redundant feature map sampling; Optimize input data utilization for complex background scenarios | [92] | | | |
| YOLO-RS | Adopt context-aware input sampling to focus on crop fine-grained regions; Balance input class distribution via AC mix module | [93] | | | | |
| YOLOX-DW | × | × | Apply adaptive sampling to balance fine-grained class distribution in input; Optimize input sample selection to avoid rare class underrepresentation | [94] | | |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).