Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

RS Transformer: A Two-Stage Region Proposal Using the Swin Transformer for Few-Shot Pest Detection in Automated Agricultural Monitoring Systems

Version 1: Received: 16 October 2023 / Approved: 18 October 2023 / Online: 18 October 2023 (17:20:58 CEST)

A peer-reviewed article of this Preprint also exists.

Wu, T.; Shi, L.; Zhang, L.; Wen, X.; Lu, J.; Li, Z. RS Transformer: A Two-Stage Region Proposal Using Swin Transformer for Few-Shot Pest Detection in Automated Agricultural Monitoring Systems. Appl. Sci. 2023, 13, 12206.

Abstract

Agriculture is pivotal to national economies, and pest detection significantly influences food quality and quantity. Pest classification remains challenging in automated agricultural monitoring systems, a difficulty exacerbated by non-uniform pest scales and the scarcity of high-quality datasets. In this study, we constructed a pest dataset by acquiring domain-agnostic images from the Internet and resizing them to a standardized 299 × 299 pixel format. We also employed diffusion models to generate supplementary data. While Convolutional Neural Networks (CNNs) are prevalent for prediction and classification, they often lack effective global information integration and discriminative feature representation. To address these limitations, we propose the RS Transformer, a model that combines a Region Proposal Network, the Swin Transformer, and ROI Align. In addition, we introduce the Randomly Generated Stable Diffusion Dataset (RGSDD) to augment the availability of high-quality pest datasets. Extensive experimental evaluations show that our approach outperforms the two-stage Faster R-CNN as well as one-stage models (SSD, YOLOv3, YOLOv4, YOLOv5m, YOLOv8, and DETR). We assess performance using mean Average Precision (mAP), F1 score, Recall, and mean Detection Time (mDT). Our research contributes to advancing pest detection methodologies in automated agricultural systems, with the aim of improving food production and quality.
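
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how Recall and the F1 score are conventionally computed from detection counts. It is not the authors' evaluation code. The `DetectionCounts` container, the function names, and the example counts are hypothetical and introduced only for illustration; the sketch assumes that predictions have already been matched to ground-truth boxes at some fixed IoU threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCounts:
    """Hypothetical container: outcomes after matching predictions to
    ground-truth boxes at a fixed IoU threshold (assumed, not from the paper)."""
    true_positives: int
    false_positives: int
    false_negatives: int


def precision(counts: DetectionCounts) -> float:
    """Fraction of predicted boxes that match a ground-truth pest."""
    predicted = counts.true_positives + counts.false_positives
    return counts.true_positives / predicted if predicted else 0.0


def recall(counts: DetectionCounts) -> float:
    """Fraction of ground-truth pests that were detected."""
    actual = counts.true_positives + counts.false_negatives
    return counts.true_positives / actual if actual else 0.0


def f1_score(counts: DetectionCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(counts), recall(counts)
    return 2 * p * r / (p + r) if (p + r) else 0.0


if __name__ == "__main__":
    # Illustrative counts only; these are not results from the study.
    example = DetectionCounts(true_positives=8, false_positives=2, false_negatives=2)
    print(precision(example), recall(example), f1_score(example))
```

Returning 0.0 when a denominator is zero is a design choice made here to keep the sketch total; other conventions (for example, treating the metric as undefined) are equally common.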

Keywords

Swin Transformer; Pest detection; Diffusion model; Feature extraction; Few-shot learning

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
