Submitted: 18 January 2025
Posted: 21 January 2025
Abstract
Keywords:
1. Introduction
- Half-Window Filtering Technology: The proposed half-window filter (HWF) draws on prior knowledge from SAR signal processing to strengthen feature extraction and improve recognition of targets at different scales. Transformer-based models perform strongly in image processing, SAR target detection, and electromagnetic signal detection, but they have not yet fully integrated SAR signal processing priors; the HWF addresses this gap.
- Auxiliary Feature Extractors: During training, auxiliary feature extractors provide additional supervision signals that strengthen the model’s encoding and decoding capabilities. Traditional object detection algorithms perform poorly on SAR target detection tasks; the auxiliary feature extractors mitigate this by promoting more robust feature learning and improving overall recognition accuracy.
- Multi-Scale Adapter: The Multi-Scale Adapter dynamically constructs feature pyramids through upsampling and downsampling operations, which improves multi-scale feature alignment and detection accuracy for targets of varying sizes.
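The pyramid construction described for the Multi-Scale Adapter can be illustrated with a minimal NumPy sketch. It is not the adapter itself: the function names (`downsample2x`, `upsample2x`, `build_pyramid`), the use of 2×2 average pooling and nearest-neighbour upsampling, and the number of levels are all assumptions chosen for illustration, and the learned layers that an adapter would normally contain are omitted.

```python
import numpy as np

def downsample2x(feat: np.ndarray) -> np.ndarray:
    """Halve the spatial size of a (C, H, W) map with 2x2 average pooling."""
    c, h, w = feat.shape
    h2, w2 = h // 2, w // 2
    # Crop any odd edge, then average each non-overlapping 2x2 block.
    blocks = feat[:, :h2 * 2, :w2 * 2].reshape(c, h2, 2, w2, 2)
    return blocks.mean(axis=(2, 4))

def upsample2x(feat: np.ndarray) -> np.ndarray:
    """Double the spatial size by nearest-neighbour repetition."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def build_pyramid(feat: np.ndarray, n_up: int = 1, n_down: int = 2) -> list:
    """Return pyramid levels ordered from finest to coarsest resolution."""
    finer, cur = [], feat
    for _ in range(n_up):
        cur = upsample2x(cur)
        finer.insert(0, cur)      # finest level goes first
    coarser, cur = [], feat
    for _ in range(n_down):
        cur = downsample2x(cur)
        coarser.append(cur)
    return finer + [feat] + coarser

# Hypothetical single-scale backbone output: 2 channels, 8x8 pixels.
feat = np.arange(2 * 8 * 8, dtype=np.float32).reshape(2, 8, 8)
pyramid = build_pyramid(feat)
shapes = [level.shape for level in pyramid]
```

Starting from one 8×8 map, this yields levels at 16×16, 8×8, 4×4, and 2×2, so small targets can be handled at the fine levels and large ones at the coarse levels.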
2. Related Work
2.1. Detection Transformer
2.2. SAR Target Detection
2.3. Filtering Technology
2.4. Auxiliary Supervision
3. Method
3.1. Half-Window Filter
3.2. Multi-Scale Adapter

3.3. Auxiliary Feature Extractor
3.4. Loss Function and Optimization
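As a rough illustration of how a matching indicator can gate a box regression loss, the sketch below builds a 0/1 indicator matrix between predicted and ground truth boxes and averages an L1 distance over matched pairs only. Everything in it is an assumption: DETR-style detectors [Carion et al.] use Hungarian matching, whereas this sketch substitutes a simpler greedy one-to-one match by IoU, and the names `box_iou`, `match_indicator`, `masked_l1_loss`, and the threshold `iou_thr` are hypothetical.

```python
import numpy as np

def box_iou(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise IoU between boxes a (N, 4) and b (M, 4) in (x1, y1, x2, y2)."""
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    lt = np.maximum(a[:, None, :2], b[None, :, :2])   # intersection top-left
    rb = np.minimum(a[:, None, 2:], b[None, :, 2:])   # intersection bottom-right
    wh = np.clip(rb - lt, 0.0, None)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def match_indicator(pred: np.ndarray, gt: np.ndarray, iou_thr: float = 0.5) -> np.ndarray:
    """ind[i, j] = 1 if prediction i is matched to ground truth j, else 0.

    Greedy one-to-one matching by IoU (a stand-in for Hungarian matching).
    """
    iou = box_iou(pred, gt)
    ind = np.zeros_like(iou)
    work = iou.copy()
    for _ in range(min(iou.shape)):
        i, j = np.unravel_index(np.argmax(work), work.shape)
        if work[i, j] < iou_thr:
            break
        ind[i, j] = 1.0
        work[i, :] = -1.0   # each prediction is used at most once
        work[:, j] = -1.0   # each ground truth is used at most once
    return ind

def masked_l1_loss(pred: np.ndarray, gt: np.ndarray, ind: np.ndarray) -> float:
    """Mean L1 box distance, counted only where the indicator is 1."""
    dist = np.abs(pred[:, None, :] - gt[None, :, :]).sum(axis=-1)
    return float((ind * dist).sum() / max(ind.sum(), 1.0))

pred = np.array([[0, 0, 10, 10], [20, 20, 30, 30], [50, 50, 60, 60]], dtype=float)
gt = np.array([[1, 1, 10, 10], [21, 20, 30, 30]], dtype=float)
ind = match_indicator(pred, gt)
loss = masked_l1_loss(pred, gt, ind)
```

In this toy case the third prediction overlaps no ground truth box, so its indicator row is all zeros and it contributes nothing to the box loss.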
is an indicator function that checks whether the predicted and ground truth boxes are matched.
4. Experiments
4.1. Dataset and Details
4.2. Comparative Experiments
4.3. Ablation Study
5. Conclusion
Author Contributions
Conflicts of Interest
References
- Chen, C.; He, C.; Hu, C.; Pei, H.; Jiao, L. MSARN: A Deep Neural Network Based on an Adaptive Recalibration Mechanism for Multiscale and Arbitrary-Oriented SAR Ship Detection. IEEE Access 2019, p. 159262–159283. [CrossRef]
- Wang, Y.; Wang, C.; Zhang, H. Ship Classification in High-Resolution SAR Images Using Deep Learning of Small Datasets. Sensors 2018, p. 2929. [CrossRef]
- Zhao, Z.Q.; Zheng, P.; Xu, S.T.; Wu, X. Object Detection with Deep Learning: A Review. IEEE Transactions on Neural Networks and Learning Systems 2019, p. 3212–3232. [CrossRef]
- O’Shea, T.J.; Corgan, J.; Clancy, T.C. Convolutional Radio Modulation Recognition Networks. In Proceedings of the International Conference on Engineering Applications of Neural Networks, 2016, pp. 213–226.
- Sturmel, N.; Daudet, L.; et al. Signal Reconstruction from STFT Magnitude: A State of the Art. In Proceedings of the International Conference on Digital Audio Effects (DAFx), 2011, pp. 375–386.
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Dec 2015. [CrossRef]
- Krasner, N. Optimal Detection of Digitally Modulated Signals. IEEE Transactions on Communications 1982, 30, 885–895.
- Rahman, M.H.; Sejan, M.A.S.; Aziz, M.A.; Baik, J.I.; Kim, D.S.; Song, H.K. Deep Learning Based Improved Cascaded Channel Estimation and Signal Detection for Reconfigurable Intelligent Surfaces-Assisted MU-MISO Systems. IEEE Transactions on Green Communications and Networking 2023, 7, 1515–1527. [CrossRef]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 2017; pp. 5998–6008.
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Proceedings of the European Conference on Computer Vision (ECCV), 2020, pp. 213–229.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Advances in Neural Information Processing Systems 2015, 28.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Jun 2014. [CrossRef]
- Zhu, X.; Su, W.; Lu, L.; Li, B.; Wang, X.; Dai, J. Deformable DETR: Deformable Transformers for End-to-End Object Detection. arXiv preprint arXiv:2010.04159 2020.
- Meng, D.; Chen, X.; Fan, Z.; Zeng, G.; Li, H.; Yuan, Y.; Sun, L.; Wang, J. Conditional DETR for Fast Training Convergence. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 3651–3660.
- Liu, S.; Li, F.; Zhang, H.; Yang, X.; Qi, X.; Su, H.; Zhu, J.; Zhang, L. DAB-DETR: Dynamic Anchor Boxes Are Better Queries for DETR. arXiv preprint arXiv:2201.12329 2022.
- Ding, S.G.; Nie, X.L.; Qiao, H.; Zhang, B. Online classification for SAR target recognition based on SVM and approximate convex hull vertices selection. National Key Laboratory of Management and Control for Complex Systems 2014, pp. 1473–1478.
- Yu, G.; Ying, X. Architecture design of deep convolutional neural network for SAR target recognition. Journal of Image and Graphics 2018.
- Kang, M.; Leng, X.; Lin, Z.; Ji, K. A modified faster R-CNN based on CFAR algorithm for SAR ship detection. IEEE 2017.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2016. [CrossRef]
- Oh, J.; Kim, M. PeaceGAN: A GAN-Based Multi-Task Learning Method for SAR Target Image Generation with a Pose Estimator and an Auxiliary Classifier. Remote Sensing 2021, p. 3939. [CrossRef]
- Lu, D.; Cao, L.; Liu, H. Few-Shot Learning Neural Network for SAR Target Recognition. In Proceedings of the 2019 6th Asia-Pacific Conference on Synthetic Aperture Radar (APSAR), Nov 2019, p. 1–4. [CrossRef]
- Zhu, X.; Mori, H. Data augmentation using style transfer in SAR automatic target classification. In Proceedings of the Artificial Intelligence and Machine Learning in Defense Applications III, Sep 2021, p. 12. [CrossRef]
- Gong, Y.; Sbalzarini, I.F. Curvature Filters Efficiently Reduce Certain Variational Energies. IEEE Transactions on Image Processing 2017, p. 1786–1798. [CrossRef]
- Ibrahim, M.; Chen, K.; Brito-Loeza, C. A novel variational model for image registration using Gaussian curvature. Geometry, Imaging and Computing 2014, 1, 417–446. [CrossRef]
- Zhu, H.; Shu, H.; Zhou, J.; Bao, X.; Luo, L. Bayesian algorithms for PET image reconstruction with mean curvature and Gauss curvature diffusion regularizations. Computers in Biology and Medicine 2007, 37, 793–804. [CrossRef]
- Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena 1992, p. 259–268. [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2016. [CrossRef]
- Caruana, R. Multitask Learning. Machine Learning 1997, p. 41–75. [CrossRef]
- Radford, A.; Kim, J.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models From Natural Language Supervision. arXiv 2021.
- Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jul 2017. [CrossRef]
- Jaderberg, M.; Mnih, V.; Czarnecki, W.; Schaul, T.; Leibo, J.; Silver, D.; Kavukcuoglu, K. Reinforcement Learning with Unsupervised Auxiliary Tasks. arXiv 2016.
- Zong, Z.; Song, G.; Liu, Y. DETRs with Collaborative Hybrid Assignments Training 2022.
- Zhang, J.; Tian, G.; Mu, Y.; Fan, W. Supervised Deep Learning with Auxiliary Networks. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Aug 2014. [CrossRef]
- Zhang, S.; Chi, C.; Yao, Y.; Lei, Z.; Li, S.Z. Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun 2020. [CrossRef]
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis and Machine Intelligence 2017, p. 1137–1149. [CrossRef]
- Wei, S.; Zeng, X.; Qu, Q.; Wang, M.; Su, H.; Shi, J. HRSID: A High-Resolution SAR Images Dataset for Ship Detection and Instance Segmentation. IEEE Access 2020, p. 120234–120254. [CrossRef]
- Yue, T.; Zhang, Y.; Liu, P.; Xu, Y.; Yu, C. A Generating-Anchor Network for Small Ship Detection in SAR Images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2022, p. 7665–7676. [CrossRef]
- Gao, F.; Cai, C.; Tang, W.; He, Y. A compact and high-efficiency anchor-free network based on contour key points for SAR ship detection. IEEE Geoscience and Remote Sensing Letters 2024.
- Bai, L.; Yao, C.; Ye, Z.; Xue, D.; Lin, X.; Hui, M. Feature Enhancement Pyramid and Shallow Feature Reconstruction Network for SAR Ship Detection.
- Zhang, T.; Zhang, X.; Liu, C.; Shi, J.; Wei, S.; Ahmad, I.; Zhan, X.; Zhou, Y.; Pan, D.; Li, J.; et al. Balance learning for ship detection from synthetic aperture radar remote sensing imagery. ISPRS Journal of Photogrammetry and Remote Sensing 2021, p. 190–207. [CrossRef]
- Bai, L.; Yao, C.; Ye, Z.; Xue, D.; Lin, X.; Hui, M. A Novel Anchor-Free Detector Using Global Context-Guide Feature Balance Pyramid and United Attention for SAR Ship Detection. IEEE Geoscience and Remote Sensing Letters 2023, p. 1–5. [CrossRef]
- Chen, C.; Zeng, W.; Zhang, X.; Zhou, Y. CS n Net: A Remote Sensing Detection Network Breaking the Second-Order Limitation of Transformers with Recursive Convolutions.




| Parameter | Value |
|---|---|
| Total sample | 12500 |
| Sampling rate | 1024 kHz |
| Bandwidth | [10, 20, 50, 100, 150] Hz |
| Signal Power | [0.1, 0.125, 0.25, 0.5, 1, 2, 4, 8, 10, 100] |
| Modulation methods | BPSK, QPSK, 8PSK, OQPSK, 16PSK, 16QAM, 64QAM, 256QAM, 2FSK, 4FSK |
| Noise Power | 1 |
| Size of spectrograms | (512, 512) |
| Ratio of dataset | train : validation : test = 8 : 1 : 1 |
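
To make the parameters in the table above concrete, the following sketch generates one noisy BPSK burst and turns it into a (512, 512) spectrogram. The sampling rate, noise power, spectrogram size, and the signal power of 4 come from the table; the FFT length, hop size, Hann window, rectangular pulse shape, samples per symbol (`sps`), and carrier offset (`f_offset`) are assumptions made for illustration, and the actual generation pipeline may differ.

```python
import numpy as np

FS = 1024e3          # sampling rate from the table, in Hz
NOISE_POWER = 1.0    # noise power from the table
N_FFT = 512          # assumed: one FFT bin per spectrogram row
HOP = 256            # assumed hop size (50% overlap)
N_FRAMES = 512       # number of time frames, giving a (512, 512) image

def bpsk_burst(n_samples, sps, power, f_offset, rng):
    """Complex baseband BPSK with rectangular pulses, scaled to `power`."""
    n_sym = -(-n_samples // sps)                      # ceiling division
    symbols = rng.choice([-1.0, 1.0], size=n_sym)
    baseband = np.repeat(symbols, sps)[:n_samples]
    t = np.arange(n_samples) / FS
    return np.sqrt(power) * baseband * np.exp(2j * np.pi * f_offset * t)

def awgn(n_samples, power, rng):
    """Complex white Gaussian noise with the given average power."""
    scale = np.sqrt(power / 2.0)
    return scale * (rng.standard_normal(n_samples)
                    + 1j * rng.standard_normal(n_samples))

def spectrogram(x):
    """Power spectrogram in dB, shape (N_FFT, n_frames), zero frequency centred."""
    n_frames = 1 + (len(x) - N_FFT) // HOP
    idx = np.arange(N_FFT)[None, :] + HOP * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hanning(N_FFT)               # windowed frames
    spec = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    return 10.0 * np.log10(np.abs(spec.T) ** 2 + 1e-12)

rng = np.random.default_rng(0)
n = N_FFT + HOP * (N_FRAMES - 1)                      # samples needed for 512 frames
x = bpsk_burst(n, sps=64, power=4.0, f_offset=100e3, rng=rng)
x = x + awgn(n, NOISE_POWER, rng)
S = spectrogram(x)                                    # rows: frequency, columns: time
```

With these assumed settings each frequency bin spans 2 kHz, so the burst appears as a horizontal band around the row corresponding to the 100 kHz offset; a detector would then be trained to localize such bands in the image.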

| Method | mAP50 | mAP |
|---|---|---|
| Faster R-CNN | 0.720 | 0.465 |
| RetinaNet | 0.789 | 0.536 |
| YOLOv6n | 0.882 | 0.628 |
| YOLOv7-tiny | 0.854 | 0.572 |
| YOLOv8n | 0.911 | 0.669 |
| YOLOv11n | 0.897 | 0.658 |
| Yue et al. [38]* | 0.911 | 0.665 |
| CPoints-Net [39]* | 0.905 | - |
| FEPS-Net [40]* | 0.907 | 0.657 |
| BL-Net [41]* | 0.867 | - |
| FBUA-Net [42]* | 0.903 | - |
| CS3Net [43]* | 0.912 | 0.66 |
| Refined Deformable-DETR | 0.902 | 0.682 |

| Method | Backbone | AP | AP50 | AP75 | APS | APM | APL |
|---|---|---|---|---|---|---|---|
| CenterNet | ResNet-18 | 0.169 | 0.299 | 0.160 | 0.122 | 0.301 | 0.417 |
| Faster R-CNN | ResNet-50 | 0.139 | 0.164 | 0.156 | 0.037 | 0.417 | 0.551 |
| YOLOv3 | Darknet | 0.004 | 0.011 | 0.003 | 0.001 | 0.012 | 0.102 |
| CornerNet | HourglassNet | 0.207 | 0.315 | 0.217 | 0.162 | 0.335 | 0.421 |
| DETR | ResNet-18 | 0.266 | 0.513 | 0.238 | 0.170 | 0.503 | 0.780 |
| DETR | ResNet-50 | 0.086 | 0.178 | 0.076 | 0.051 | 0.201 | 0.770 |
| Deformable-DETR | ResNet-50 | 0.257 | 0.405 | 0.273 | 0.225 | 0.372 | 0.997 |
| DAB-DETR | ResNet-50 | 0.211 | 0.391 | 0.195 | 0.170 | 0.360 | 0.898 |
| DAB-DETR | ResNet-50 | 0.228 | 0.422 | 0.213 | 0.171 | 0.431 | 0.915 |
| Conditional-DETR | ResNet-50 | 0.112 | 0.223 | 0.093 | 0.078 | 0.221 | 0.788 |
| Refined Deformable-DETR | ResNet-50 | 0.540 | 0.804 | 0.586 | 0.462 | 0.791 | 0.986 |

| Auxiliary Feature Extractor | HWF | AP | AP50 |
|---|---|---|---|
| YOLOv3-based AFE | ✓ | 0.682 | 0.902 |
| YOLOv3-based AFE | × | 0.619 | 0.887 |
| Faster R-CNN-based AFE | ✓ | 0.602 | 0.858 |
| Faster R-CNN-based AFE | × | 0.587 | 0.822 |
| ATSS-based AFE | ✓ | 0.677 | 0.882 |
| ATSS-based AFE | × | 0.652 | 0.837 |

| Auxiliary Feature Extractor | HWF | AP | AP50 |
|---|---|---|---|
| YOLOv3-based AFE | ✓ | 0.540 | 0.804 |
| YOLOv3-based AFE | × | 0.525 | 0.794 |
| Faster R-CNN-based AFE | ✓ | 0.527 | 0.782 |
| Faster R-CNN-based AFE | × | 0.499 | 0.664 |
| ATSS-based AFE | ✓ | 0.533 | 0.771 |
| ATSS-based AFE | × | 0.510 | 0.752 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).