Integrating Frequency-Spatial Features for Energy-Efficient OPGW Target Recognition in UAV-Assisted Mobile Monitoring
Lin Huang, Xubin Ren, Daiming Qu, Lanhua Li, Jing Xu
Posted: 05 December 2025
SIFT-SNN for Traffic-Flow Infrastructure Safety: A Real-Time Context-Aware Anomaly Detection Framework
Munish Rathee, Boris Bačić, Maryam Doborjeh
Posted: 03 December 2025
Linear-Region-Based Contour Tracking for Edge Images
Erick Huitrón-Ramírez, Leonel G. Corona-Ramírez, Diego Jiménez-Badillo
Posted: 02 December 2025
LBA-Net: Lightweight Boundary-Aware Network for Efficient Breast Ultrasound Image Segmentation
Ye Deng, Meng Chen, Jieguang Liu, Qi Cheng, Xiaopeng Xu, Yali Qu
Breast ultrasound segmentation is challenged by strong noise, low contrast, and ambiguous lesion boundaries. Although deep models achieve high accuracy, their heavy computational cost limits deployment on portable ultrasound devices. In contrast, lightweight networks often struggle to preserve fine boundary details. To address this gap, we propose a lightweight boundary-aware network (LBA-Net). A MobileNetV3-based encoder with atrous spatial pyramid pooling is integrated for efficient multi-scale representation learning. A lightweight boundary-aware block adaptively fuses efficient channel attention and depthwise spatial attention, enhancing discriminative capability with minimal computational overhead. A boundary-guided dual-head decoding scheme injects explicit boundary priors and enforces boundary consistency to sharpen and stabilize margin delineation. Experiments on curated BUSI* and BUET* datasets demonstrate that the proposed network achieves 82.8% Dice, 38 px HD95, and real-time inference speeds (123 FPS GPU / 19 FPS CPU) using only 1.76M parameters. These results show that the proposed network offers a highly favorable balance between accuracy and efficiency.
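No implementation accompanies this listing; purely as an illustration of the attention design the abstract describes, the minimal PyTorch sketch below combines efficient channel attention and depthwise spatial attention through a learnable fusion weight. The module names, kernel sizes, and scalar fusion scheme are assumptions made for this sketch, not the authors' LBA-Net code.

# Minimal PyTorch sketch of a boundary-aware attention block of the kind the
# abstract describes (efficient channel attention + depthwise spatial attention,
# fused adaptively). All names and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

class EfficientChannelAttention(nn.Module):
    def __init__(self, k: int = 3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):
        w = self.pool(x)                                   # (B, C, 1, 1)
        w = self.conv(w.squeeze(-1).transpose(1, 2))       # 1-D conv across channels
        w = torch.sigmoid(w.transpose(1, 2).unsqueeze(-1))
        return x * w

class DepthwiseSpatialAttention(nn.Module):
    def __init__(self, channels: int, k: int = 7):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, k, padding=k // 2,
                            groups=channels, bias=False)
        self.proj = nn.Conv2d(channels, 1, 1)

    def forward(self, x):
        return x * torch.sigmoid(self.proj(self.dw(x)))    # per-pixel gate

class LightweightBoundaryAwareBlock(nn.Module):
    """Adaptively fuses the channel- and spatial-attention branches."""
    def __init__(self, channels: int):
        super().__init__()
        self.eca = EfficientChannelAttention()
        self.dsa = DepthwiseSpatialAttention(channels)
        self.alpha = nn.Parameter(torch.tensor(0.5))       # learnable fusion weight

    def forward(self, x):
        a = torch.sigmoid(self.alpha)
        return a * self.eca(x) + (1 - a) * self.dsa(x)

if __name__ == "__main__":
    feats = torch.randn(1, 64, 56, 56)
    print(LightweightBoundaryAwareBlock(64)(feats).shape)  # torch.Size([1, 64, 56, 56])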
Posted: 01 December 2025
UCA-Net: A Transformer-Based U-Shaped Underwater Enhancement Network with Compound Attention Mechanism
Cheng Yu, Jian Zhou, Lin Wang, Guizhen Liu, Zhongjun Ding
Images captured underwater frequently suffer from color casts, blurring, and distortion, which are mainly attributable to the unique optical characteristics of water. Although conventional underwater image enhancement (UIE) methods rooted in physics are available, their effectiveness is often constrained, particularly in challenging aquatic and illumination conditions. More recently, deep learning has become a leading paradigm for UIE, recognized for its superior performance and operational efficiency. This paper proposes UCA-Net, a lightweight CNN-Transformer hybrid network. It incorporates multiple attention mechanisms and utilizes composite attention to effectively enhance textures, reduce blur, and correct color. A novel adaptive sparse self-attention module is introduced to jointly restore global color consistency and fine local details. The model employs a U-shaped encoder-decoder architecture with three-stage up- and down-sampling, facilitating multi-scale feature extraction and global context fusion for high-quality enhancement. Experimental results on multiple public datasets demonstrate UCA-Net’s superior performance, achieved with fewer parameters and lower computational cost. Its effectiveness is further validated by improvements in various downstream image tasks.
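As a rough, non-authoritative illustration of the sparse attention idea the abstract mentions, the PyTorch sketch below lets each query attend only to its strongest keys before the softmax. The top-k gating, head count, and keep ratio are assumptions made for this sketch and do not reproduce the authors' adaptive sparse self-attention module.

# Minimal sketch of a sparse self-attention layer: each query keeps only its
# top-k attention logits; the rest are masked out before the softmax.
import torch
import torch.nn as nn

class SparseSelfAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4, keep_ratio: float = 0.25):
        super().__init__()
        self.heads = heads
        self.scale = (dim // heads) ** -0.5
        self.keep_ratio = keep_ratio
        self.qkv = nn.Linear(dim, dim * 3, bias=False)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                                  # x: (B, N, C) image tokens
        B, N, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(B, N, self.heads, -1).transpose(1, 2) for t in (q, k, v))
        attn = (q @ k.transpose(-2, -1)) * self.scale      # (B, H, N, N)
        # keep only the strongest keep_ratio * N logits per query, mask the rest
        k_keep = max(1, int(self.keep_ratio * N))
        thresh = attn.topk(k_keep, dim=-1).values[..., -1:]
        attn = attn.masked_fill(attn < thresh, float("-inf"))
        attn = attn.softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

if __name__ == "__main__":
    tokens = torch.randn(2, 256, 64)                       # e.g. a 16x16 feature map, 64 channels
    print(SparseSelfAttention(64)(tokens).shape)           # torch.Size([2, 256, 64])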
Posted: 01 December 2025
Machine Vision and Deep Learning for Robotic Harvesting of Shiitake Mushrooms
Thomas Rowland, Mark Hansen, Melvyn Smith, Lyndon Smith
Automation and computer vision are increasingly vital in modern agriculture, yet mushroom harvesting remains largely manual due to complex morphology and occluded growing environments. This study investigates the application of deep learning–based instance segmentation and keypoint detection to enable robotic harvesting of Lentinula edodes (shiitake) mushrooms. A dedicated RGB-D image dataset, the first open-access RGB-D dataset for mushroom harvesting, was created using a Microsoft Azure Kinect DK 3D camera under varied lighting and backgrounds. Two state-of-the-art segmentation models, YOLOv8-seg and Detectron2 Mask R-CNN, were trained and evaluated under identical conditions to compare accuracy, inference speed, and robustness. YOLOv8 achieved higher mean average precision (mAP = 67.9) and significantly faster inference, while Detectron2 offered comparable qualitative performance and greater flexibility for integration into downstream robotic systems. Experiments comparing RGB and RG-D inputs revealed minimal accuracy differences, suggesting that colour cues alone provide sufficient information for reliable segmentation. A proof-of-concept keypoint-detection model demonstrated the feasibility of identifying stem cut-points for robotic manipulation. These findings confirm that deep learning–based vision systems can accurately detect and localise mushrooms in complex environments, forming a foundation for fully automated harvesting. Future work will focus on expanding datasets, incorporating true four-channel RGB-D networks, and integrating perception with robotic actuation for intelligent agricultural automation.
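For readers unfamiliar with the tooling, the snippet below sketches instance-segmentation inference with the off-the-shelf Ultralytics YOLOv8-seg API, one of the two model families compared in the study. The pretrained COCO weights and image path are placeholders; the authors' fine-tuned mushroom model and RGB-D pipeline are not reproduced here.

# Minimal sketch of YOLOv8-seg instance-segmentation inference with Ultralytics.
# Weight file and image path are placeholders, not the study's trained model.
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")          # pretrained COCO weights as a stand-in
results = model("shiitake_bed.jpg")     # hypothetical RGB image of a growing bed

for r in results:
    if r.masks is None:
        continue
    for box, polygon in zip(r.boxes, r.masks.xy):
        cls_id = int(box.cls)
        conf = float(box.conf)
        # polygon is an (N, 2) array of mask contour points in pixel coordinates;
        # a harvesting system would pass these (plus depth) to a cut-point planner.
        print(model.names[cls_id], f"{conf:.2f}", polygon.shape)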
Posted: 26 November 2025
The Impact of Responsive Design on User Experience
Kylychbek Parpiev
Posted: 26 November 2025
A Lightweight Degradation-Aware Framework for Robust Object Detection in Adverse Weather
Seungun Park, Jiakang Kuai, Hyunsu Kim, Hyunseong Ko, ChanSung Jung, Yunsik Son
Posted: 26 November 2025
Dynamic Contextual Relational Alignment Network for Open-Vocabulary Video Visual Relation Detection
Linyu Lou, Jiarong Mo
Posted: 25 November 2025
Markerless AR Registration Framework Using Multi-Modal Imaging for Orthopedic Surgical Guidance
James R. Whitmore, Hui Lin, Charlotte P. Evans, David K. Mitchell, Sophie L. Carter
Posted: 25 November 2025
Self-Calibrating Dual-Stream Network for Semi-Supervised 3D Medical Image Segmentation
Zeyuan Xun
Posted: 25 November 2025
Adaptive Spatiotemporal Condenser for Efficient Long-Form Video Question Answering
Bowen Nian, Mingyu Tan
Posted: 25 November 2025
Context-Aware Knowledge Harmonization for Visual Question Reasoning
Lina Vermeersch, Quentin Moor, Elodie Fairchild, Sarah Van Steen
Posted: 24 November 2025
FlareSat: A Benchmark Landsat 8 Dataset for Gas Flaring Segmentation in Oil and Gas Facilities
Osmary Camila Bortoncello Glober, Ricardo Dutra da Silva
Posted: 24 November 2025
Motion Detection Development for Dynamic SLAM Based on Epipolar Geometry
Sedat Dikici, Fikret Arı
Posted: 21 November 2025
Comparison of the Performance of Baseline CNN and Transfer Learning for Classifying Road Markings (Pedestrian Crossing, Speed Bump, Lane Divider)
Aibiike Omorova
Posted: 19 November 2025
MedSeg-Adapt: Clinical Query-Guided Adaptive Medical Image Segmentation via Generative Data Augmentation and Benchmarking
Gregory Yu, Aaron Collins, Ian Butler
Posted: 19 November 2025
A Survey on Video Generation Technologies, Applications, and Ethical Considerations
Kaiqi Chen
Posted: 19 November 2025
A Survey of Recent Advances in Adversarial Attack and Defense on Vision-Language Models
Md Iqbal Hossain, Neeresh Kumar Perla, Afia Sajeeda, Siyu Xia, Ming Shao
Posted: 18 November 2025
A Computer Vision and AI-Based System for Real-Time Detection and Diagnosis of Olive Leaf Diseases
Saud Saad Alqahtani, Hossam El-Din Moustafa, Elsaid A. Marzouk, Ramadan Madi Ali Bakir
This paper introduces OLIVE-CAD, a novel Computer-Aided Diagnostics system designed for real-life, on-site detection of olive leaf diseases. The core of the system is a YOLOv12-based convolutional neural network model, trained on a comprehensive dataset of 11,315 olive leaf images. The images were categorized into 'Aculus', 'Scab', and 'Healthy', with the dataset divided for training (70%), evaluation (20%), and real-world testing (10%). The key contribution of this work is the end-to-end integration of a custom, field-deployable Computer-Aided Diagnostics system. The trained YOLOv12 model achieved a mean average precision of 98.2% and a mean average recall of 95.4%, with class-specific precision and recall of 95.3% and 97.7% for the 'Healthy' class, 97.9% and 88.3% for the 'Aculus' class, and 94.3% and 95.4% for the 'Scab' class. OLIVE-CAD stores immediate diagnostic outcomes in a predesigned database, providing a practical, deployable solution for agricultural applications. The research recommends a centralized, IoT-based real-time diagnostic monitoring system as future work.
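The deployment loop the abstract describes (detect leaf disease on-site, then store the diagnosis in a database) can be sketched with a YOLO-style API and SQLite as below. The weight file, image path, and table schema are assumptions for illustration, not the authors' OLIVE-CAD implementation.

# Minimal sketch: run a YOLO-style leaf-disease detector on a field image and
# log each detection to a local database. Paths and schema are placeholders.
import sqlite3
from datetime import datetime, timezone

from ultralytics import YOLO

model = YOLO("olive_leaf_detector.pt")      # hypothetical fine-tuned weights
                                            # ('Aculus', 'Scab', 'Healthy' classes)

conn = sqlite3.connect("diagnostics.db")
conn.execute("""CREATE TABLE IF NOT EXISTS diagnoses (
                    ts TEXT, image TEXT, label TEXT, confidence REAL)""")

image_path = "leaf_sample.jpg"              # hypothetical on-site capture
for r in model(image_path):
    for box in r.boxes:
        conn.execute(
            "INSERT INTO diagnoses VALUES (?, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), image_path,
             model.names[int(box.cls)], float(box.conf)),
        )
conn.commit()
conn.close()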
Posted: 17 November 2025