Submitted: 02 February 2026
Posted: 03 February 2026
Abstract
Keywords:
1. Introduction
- We propose a unified oriented detection framework, P2R-OBB, that addresses multi-scale feature loss and dynamic feature enhancement within a single architecture.
- We design a Dynamic RCBAM Module that enables adaptive, orientation-aware feature modulation through a learnable alignment mechanism.
- We conduct comprehensive experiments on challenging benchmarks, showing that our method achieves a superior accuracy-complexity trade-off, with significant performance gains on complex datasets, demonstrating its effectiveness for practical maritime surveillance.
2. Related Work
2.1. Generic Object Detection
2.2. Ship Detection in Remote Sensing
2.3. Overview
2.4. P2 Feature Pyramid Enhancement Network
2.5. Dynamic RCBAM Module
2.5.1. Dynamic Rotation for Orientation Alignment
3. Experiments
3.1. Datasets and Evaluation Metrics
3.2. Implementation Details
3.3. Comparisons with Previous Methods
| Dataset | Model | (%) | |
|---|---|---|---|
| HRSID | YOLOv8-OBB | 88.9 | 11.6 |
| | + CA [9] | 90.3 | 12.1 |
| | + ECA [10] | 89.5 | 11.8 |
| | Deformable DETR [15] | 87.2 | 78.4 |
| | Sparse DETR [16] | 88.5 | 61.2 |
| | P2R-OBB (Ours) | 92.5 | 12.3 |
| SSDD+ | YOLOv8-OBB | 53.5 | 8.3 |
| | P2R-OBB (Ours) | 50.8 | 12.5 |
| HRSC2016 | YOLOv8-OBB | 89.8 | 8.3 |
| | P2R-OBB (Ours) | 90.3 | 12.5 |
| DOTA v1-ship | YOLOv8-OBB | 48.7 | 8.3 |
| | P2R-OBB (Ours) | 59.8 | 12.5 |

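
As a quick arithmetic check of the comparison table above, the following minimal Python sketch computes the per-dataset difference between P2R-OBB and the YOLOv8-OBB baseline. The scores are copied from the table; because the header of the percentage column does not name its metric, the generic label `score` and the helper name `gain_over_baseline` are assumptions made only for illustration.

```python
# Percentage-column values copied from the comparison table.
# The metric name is not preserved in the header, so it is called "score" here.
scores = {
    "HRSID": {"YOLOv8-OBB": 88.9, "P2R-OBB": 92.5},
    "SSDD+": {"YOLOv8-OBB": 53.5, "P2R-OBB": 50.8},
    "HRSC2016": {"YOLOv8-OBB": 89.8, "P2R-OBB": 90.3},
    "DOTA v1-ship": {"YOLOv8-OBB": 48.7, "P2R-OBB": 59.8},
}


def gain_over_baseline(table):
    """Return {dataset: P2R-OBB score minus YOLOv8-OBB score}, in points."""
    return {
        name: round(row["P2R-OBB"] - row["YOLOv8-OBB"], 1)
        for name, row in table.items()
    }


gains = gain_over_baseline(scores)
for name, delta in gains.items():
    print(f"{name:>13}: {delta:+.1f} points")
```

Under these numbers the change is uneven across datasets: the largest improvement is on DOTA v1-ship (+11.1 points), while SSDD+ shows a 2.7-point decrease relative to the baseline.
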
| Pos. | Feat. Layer | Base Loss | Rank |
|---|---|---|---|
| P2 | Small (P2/4) | 1.0968 | 1 |
| P3 | Small-med (P3/8) | 1.1726 | 4 |
| P4 | Med (P4/16) | 1.1259 | 3 |
| P5 | Large (P5/32) | 1.1221 | 2 |
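
The Rank column in the table above appears to order the four insertion positions by ascending base loss (lowest loss ranked first). The short Python sketch below reproduces the ranking under that assumption; the function name `rank_by_loss` is hypothetical and the loss values are copied from the table.

```python
# Base-loss values copied from the position table.
base_loss = {"P2": 1.0968, "P3": 1.1726, "P4": 1.1259, "P5": 1.1221}


def rank_by_loss(losses):
    """Assign rank 1 to the lowest loss, rank 2 to the next lowest, and so on."""
    ordered = sorted(losses, key=losses.get)
    return {position: rank for rank, position in enumerate(ordered, start=1)}


ranks = rank_by_loss(base_loss)
print(ranks)
```

If the assumption holds, the computed ranks match the table: P2 first, followed by P5, P4, and P3.
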
3.4. Ablation Study
4. Conclusion
References
1. Wang, C.; Li, W.; Liu, X.; Zhang, L. A Comprehensive Survey on Oriented Object Detection in Remote Sensing Imagery. IEEE Transactions on Geoscience and Remote Sensing 2023, 61, 1–28.
2. Li, J.; Wang, Y.; Zhang, B.; Ghamisi, P. Lightweight Coordinate Attention for Real-Time Ship Detection in Remote Sensing Imagery. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2022; pp. 5678–5681.
3. Wu, Y.; Liu, Z.; Zhou, Z.; Li, W.; Zhang, H. HRSID: A High-Resolution SAR Images Dataset for Ship Detection and Instance Segmentation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2021, 14, 11013–11026.
4. Zhang, J.; Li, M.; Wang, H.; Su, H. Industrial Application of Ship Detection in Remote Sensing Imagery for Maritime Surveillance. IEEE Transactions on Intelligent Transportation Systems 2023, 24, 8901–8912.
5. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv8: Evolution of Real-Time Object Detection. arXiv 2023, arXiv:2301.05085.
6. Ultralytics. Ultralytics YOLOv8, 2023. Version 8.0, Computer software.
7. Chen, Y.; Jiang, M.; Li, P.; Ghamisi, P. Transformer-Based Methods for Remote Sensing Image Object Detection: A Survey. IEEE Transactions on Geoscience and Remote Sensing 2023, 61, 1–24.
8. Liu, S.; Chen, Y.; Zhang, W.; Li, H. Enhanced Feature Pyramid Network for Small Object Detection in Remote Sensing Images. Pattern Recognition Letters 2021, 152, 123–129.
9. Hou, Q.; Zhou, D.; Feng, J. Coordinate Attention for Efficient Mobile Network Design. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021; pp. 13713–13722.
10. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020; pp. 11534–11542.
11. Chen, X.; Ding, M.; Wang, J.; Li, J. Dynamic Convolution for Rotated Object Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2021; pp. 10012–10021.
12. Zhang, Y.; Liu, C.L.; Wang, M. Scalable Attention Module for Multi-Scale Object Detection in Remote Sensing Imagery. Pattern Recognition 2022, 128, 108668.
13. Fan, H.; Pang, J.; Cao, Y.; Li, G. Rotation-Aware Spatial Attention for Oriented Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023; pp. 7890–7899.
14. Wang, C.; Li, W.; Liu, X.; Zhang, L. Scale Distribution Analysis of Ship Targets in Remote Sensing Imagery. IEEE Geoscience and Remote Sensing Letters 2022, 19, 1–5.
15. Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. International Journal of Computer Vision 2022, 129, 994–1010.
16. Li, G.; Zhang, X.; Sun, J. Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity. arXiv 2023, arXiv:2303.06250.
17. Shen, F.; Ye, H.; Zhang, J.; Wang, C.; Han, X.; Wei, Y. Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models. In Proceedings of the Twelfth International Conference on Learning Representations, 2024.
18. Shen, F.; Tang, J. Imagpose: A unified conditional framework for pose-guided person generation. Advances in Neural Information Processing Systems 2024, 37, 6246–6266.
19. Shen, F.; Jiang, X.; He, X.; Ye, H.; Wang, C.; Du, X.; Li, Z.; Tang, J. Imagdressing-v1: Customizable virtual dressing. In Proceedings of the AAAI Conference on Artificial Intelligence, 2025; Vol. 39, pp. 6795–6804.
20. Shen, F.; Yu, J.; Wang, C.; Jiang, X.; Du, X.; Tang, J. IMAGGarment-1: Fine-Grained Garment Generation for Controllable Fashion Design. arXiv 2025, arXiv:2504.13176.
21. Shen, F.; Wang, C.; Gao, J.; Guo, Q.; Dang, J.; Tang, J.; Chua, T.S. Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model. In Proceedings of the Forty-second International Conference on Machine Learning.
22. Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017; pp. 2117–2125.
23. Li, Y.; Wang, X.; Zhang, L.; Chen, J. Bidirectional Feature Pyramid Network for Rotated Object Detection in Remote Sensing Imagery. IEEE Transactions on Geoscience and Remote Sensing 2022, 60, 1–14.
24. Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. IEEE Transactions on Pattern Analysis and Machine Intelligence 2020, 42, 2962–2978.
25. Zhang, J.; Li, W.; Wang, H.; Su, H. SSDD+: An Expanded SAR Ship Detection Dataset with Multi-Scenario and Multi-Sensor Characteristics. IEEE Geoscience and Remote Sensing Letters 2022, 19, 1–5.
26. Li, W.; Wang, H.; Li, Q.; Su, H. HRSC2016: A High-Resolution SAR Ship Detection Dataset. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2020; pp. 4330–4333.
27. Ma, J.; Shao, Z.; Ye, H.; Wang, Z.; Zhang, X.; Xue, N. Rotated Object Detection with Adaptive NMS and Oriented Bounding Box Evaluation. IEEE Transactions on Image Processing 2022, 31, 1939–1951.
28. Wang, C.; Li, W.; Liu, X.; Zhang, L. Rotated NMS: Efficient Non-Maximum Suppression for Oriented Object Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022; pp. 9876–9885.
29. Zhang, Y.; Liu, C.L.; Wang, M. Extreme Scale Ship Detection in Remote Sensing Images Using Hierarchical Feature Fusion. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2022; pp. 7890–7893.
30. Zhang, H.; Wang, C.; Li, J.; Xu, F. Speckle Noise Suppression for SAR Ship Detection Using Attention-Guided Denoising Network. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2023, 16, 3456–3468.
31. Chen, Y.; Jiang, M.; Li, P.; Ghamisi, P. Robustness Analysis of Object Detectors in Remote Sensing Imagery Under Adverse Conditions. IEEE Transactions on Geoscience and Remote Sensing 2023, 61, 1–16.




| Dataset | Modality | #Images | #Instances | Key Characteristics |
|---|---|---|---|---|
| DOTA v1-ship [24] | Optical | 2,800 | >18,000 | Small, dense, arbitrary orientations |
| HRSC2016 [26] | Optical | 1,061 | 2,976 | High-resolution, extreme aspect ratios |
| HRSID [3] | SAR (Multi-pol.) | 1,160 | 2,456 | High-res, speckle noise, near/offshore |
| SSDD+ [25] | SAR (Dual-pol.) | 2,350 | >5,200 | Multi-scenario, cluttered backgrounds |

| Variant | | | GFLOPs | P |
|---|---|---|---|---|
| Baseline | 90.9 | 61.0 | 8.3 | — |
| + RCBAM | 91.0 | 61.3 | 8.3 | +0.01M |
| + P2-FPN | 92.2 | 62.5 | 12.4 | +2.95M |
| Full P2R-OBB | 92.5 | 62.8 | 12.5 | +2.96M |
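
To make the ablation rows above easier to compare, the following Python sketch computes each variant's change relative to the baseline and checks whether the two modules combine additively. The row values are copied from the table; the two score columns have no surviving header, so they are treated generically, and the third value is read as GFLOPs, which is an assumption based on how the row values line up with the header. The helper name `delta_vs_baseline` is hypothetical.

```python
# Rows copied from the ablation table: (first score, second score, GFLOPs).
# Reading the third value as GFLOPs is an assumption about the column order.
ablation = {
    "Baseline": (90.9, 61.0, 8.3),
    "+ RCBAM": (91.0, 61.3, 8.3),
    "+ P2-FPN": (92.2, 62.5, 12.4),
    "Full P2R-OBB": (92.5, 62.8, 12.5),
}


def delta_vs_baseline(rows, variant):
    """Return the per-column difference between a variant and the baseline."""
    base = rows["Baseline"]
    return tuple(round(v - b, 1) for v, b in zip(rows[variant], base))


rcbam = delta_vs_baseline(ablation, "+ RCBAM")
p2fpn = delta_vs_baseline(ablation, "+ P2-FPN")
full = delta_vs_baseline(ablation, "Full P2R-OBB")

# Gap between the combined gain and the sum of the single-module gains
# on the first score column.
interaction = round(full[0] - (rcbam[0] + p2fpn[0]), 1)

# Relative compute overhead of the full model, in percent of the baseline.
gflops_overhead = round(100 * full[2] / ablation["Baseline"][2], 1)

print(rcbam, p2fpn, full, interaction, gflops_overhead)
```

Read this way, most of the gain comes from the P2-FPN branch (+1.3 on the first score column, against +0.1 for RCBAM alone), the combined model gains 0.2 points more than the sum of the two individual gains, and the full model costs roughly 50.6% more compute than the baseline.
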

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).