Chen, J.; Xu, K.; Ning, Y.; Jiang, L.; Xu, Z. CRTED: Few-Shot Object Detection via Correlation-RPN and Transformer Encoder–Decoder. Electronics 2024, 13, 1856.
Abstract
Few-shot object detection (FSOD) aims to address the challenge that conventional object detection requires a substantial amount of annotations for training, which is very labor-intensive. However, existing few-shot methods either achieve high precision at the cost of time-consuming, exhaustive fine-tuning, or perform poorly in novel-class adaptation. We presume the major reason is that the valuable correlation feature among different categories is insufficiently exploited, hindering the generalization of knowledge from base to novel categories for object detection. In this paper, we propose few-shot object detection via Correlation-RPN and Transformer Encoder–Decoder (CRTED), a novel training network that learns object-relevant features of inter-class correlation and intra-class compactness while suppressing object-agnostic background features with limited annotated samples. We also introduce a four-way tuple-contrast training strategy to positively activate the training progress of our object detector. Experiments on two few-shot benchmarks (Pascal VOC, MS COCO) demonstrate that our proposed CRTED, without further fine-tuning, can achieve performance comparable to current state-of-the-art fine-tuned methods. The code and pre-trained models will be released.
Keywords
Few-shot object detection; Region proposal network; Transformer Encoder–Decoder; Training strategies
Subject
Computer Science and Mathematics, Computer Vision and Graphics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.