Preprint
Article

This version is not peer-reviewed.

RHG-DETR: Riemannian Hyper-Graph Transformer with Dynamic Receptive Fields for Detecting Special Targets in Degraded UAV Imagery

Submitted: 23 April 2026

Posted: 24 April 2026

Abstract
Accurate detection of special targets in unmanned aerial vehicle (UAV) remote sensing imagery under complex degradation conditions remains a critical challenge for intelligent surveillance systems. Existing detectors lose substantial accuracy when confronted with composite degradation factors such as blur, rain, snow, fog, low illumination, strong light, and electromagnetic interference. To address this limitation, we propose RHG-DETR (Riemannian Hyper-Graph Detection Transformer), a detection framework for robust special target detection under multi-type degradation in UAV remote sensing imagery. Using RT-DETR as the baseline, we introduce three complementary modifications at the backbone, neck, and encoder levels. The Dynamic Receptive-field Hyper-graph Attention Network (DRHANet) replaces the conventional ResNet backbone, employing anisotropic dynamic depthwise separable convolution and a Riemannian Hyper-graph Fusion (RHGF) mechanism to model high-order semantic topology dependencies among target components. The Bi-directional Weighted Adaptive Fusion Network (BWAFN) constructs a two-stage bidirectional feature pyramid with learnable scale contribution weights and a lightweight spatial compensation upsampler to maintain cross-scale semantic consistency under atmospheric degradation. The Adaptive Sparse Multi-scale Encoder with Dynamic normalization (ASMED) reconstructs the AIFI encoder module with three components: sparse window self-attention to suppress background interference, a spatial-gated feedforward fusion to preserve the geometric topology constraints of target sub-components, and coordinated dynamic normalization modules to stabilize encoding under extreme illumination and electromagnetic interference.
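The "learnable scale contribution weights" in BWAFN can be illustrated with a small sketch. The abstract does not give the fusion formula, so the code below assumes the fast normalized fusion commonly used in bidirectional feature pyramids: each weight is clipped to be non-negative and divided by the sum of all weights. The function name `fuse_scales`, the `eps` value, and the toy feature vectors are hypothetical and are not taken from the paper.

```python
def fuse_scales(features, raw_weights, eps=1e-4):
    """Fuse same-length feature vectors with normalized, non-negative weights.

    Assumption: BiFPN-style fast normalized fusion. The paper's BWAFN
    may use a different formulation.
    """
    if len(features) != len(raw_weights):
        raise ValueError("need exactly one weight per feature vector")
    # Clip at zero (ReLU) so no scale contributes negatively.
    positive = [max(w, 0.0) for w in raw_weights]
    # eps keeps the division stable when all weights are clipped to zero.
    total = sum(positive) + eps
    fused = [0.0] * len(features[0])
    for feat, w in zip(features, positive):
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused


# Toy inputs standing in for two pyramid levels resized to a common shape.
shallow = [1.0, 2.0, 3.0]
deep = [3.0, 2.0, 1.0]

balanced = fuse_scales([shallow, deep], raw_weights=[1.0, 1.0])
shallow_only = fuse_scales([shallow, deep], raw_weights=[1.0, -5.0])
```

With equal weights the result is close to the element-wise mean of the two inputs. A negative raw weight is clipped to zero, so the second call returns approximately the shallow features alone. In a trained network the raw weights would be parameters updated by backpropagation rather than fixed constants.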
On a self-constructed special target dataset comprising tanks, multiple launch rocket systems, and soldiers under seven degradation types, RHG-DETR achieves an mAP50 of 78.5%, surpassing the RT-DETR baseline by 3.7%, while reducing GFLOPs and parameter count by 34.4% and 28.8%, respectively, at an inference speed of 84.2 FPS. Consistent improvements on VisDrone2019 and BDD100K further validate the cross-domain generalization capability of the proposed framework.
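The headline figures can be unpacked with simple arithmetic. The sketch below assumes that the 3.7% gain is an absolute difference in percentage points, which the abstract does not state explicitly, and that the reported FPS corresponds to sequential single-image inference. All variable names are illustrative.

```python
# Figures as reported in the abstract.
reported_map50 = 78.5      # RHG-DETR mAP50, in percent
gain_over_baseline = 3.7   # ASSUMED to be absolute percentage points
gflops_reduction = 34.4    # percent fewer GFLOPs than RT-DETR
param_reduction = 28.8     # percent fewer parameters than RT-DETR
fps = 84.2                 # reported inference speed

# Implied RT-DETR baseline mAP50 under the percentage-point assumption.
baseline_map50 = round(reported_map50 - gain_over_baseline, 1)

# Remaining compute and model size as a fraction of the baseline.
gflops_ratio = round(1 - gflops_reduction / 100, 3)
param_ratio = round(1 - param_reduction / 100, 3)

# Per-frame latency in milliseconds, assuming one image per forward pass.
latency_ms = 1000.0 / fps

print(f"implied baseline mAP50: {baseline_map50}%")
print(f"GFLOPs relative to baseline: {gflops_ratio:.3f}")
print(f"parameters relative to baseline: {param_ratio:.3f}")
print(f"per-frame latency: {latency_ms:.2f} ms")
```

Under these assumptions the baseline mAP50 would be 74.8%, and RHG-DETR would use roughly two thirds of the baseline's GFLOPs and about 71% of its parameters. If the 3.7% were instead a relative improvement, the implied baseline would be higher, so the percentage-point reading should be checked against the paper's results tables.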
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
