Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Interpretability Analysis and Attention Mechanism of Deep Learning-Based Microscopic Vision

Version 1: Received: 11 April 2024 / Approved: 11 April 2024 / Online: 11 April 2024 (15:18:27 CEST)

How to cite: Xu, Z.; Zhao, X.; Wang, X.; Kong, Y.; Ren, T.; Wang, Y. Interpretability Analysis and Attention Mechanism of Deep Learning-Based Microscopic Vision. Preprints 2024, 2024040823. https://doi.org/10.20944/preprints202404.0823.v1

Abstract

Microscopic vision plays an important role in automated micro-assembly. However, uncertain factors in the assembly process, such as occlusion and stains, can cause feature-extraction errors. To address this problem, deep learning (DL) techniques are applied to the feature recognition tasks, with a focus on the attention mechanism and on visualizing convolutional neural networks (CNNs) for DL-based microscopic vision. The main contributions are as follows. (1) The convolutional block attention module (CBAM) is combined with the YOLOv5 algorithm to improve the accuracy and robustness of feature extraction. In the micropart feature occlusion experiment, at a 70% occlusion degree YOLOv5-CBAM reaches 97.9% mAP@0.5, which is 4.6% higher than the original YOLOv5. (2) Visualization analysis of the DL-based model is conducted using gradient-weighted class activation mapping (Grad-CAM) to make the decision results more transparent and to avoid potential visual detection risks during assembly. The heatmap matching degree between the ground-truth (GT) area and the highlighted area increases by 27.81% on average, which further verifies the effectiveness of the attention mechanism in micropart feature localization. (3) Classification models based on ResNet50 are trained for micropart surface stains and droplet quality to replace manual sorting. The visualization results are consistent with human visual judgment, supporting the reliability of part and droplet sorting.
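
For readers unfamiliar with Grad-CAM, the following minimal Python sketch illustrates how a heatmap is formed from a layer's feature maps and their gradients: the gradients are averaged per channel to obtain weights, the feature maps are summed with those weights, and a ReLU is applied. The abstract does not define the heatmap matching degree, so `matching_degree` below is only a hypothetical stand-in (the share of highlighted pixels that fall inside the ground-truth box); the function names, the threshold of 0.5, and the box convention are all assumptions made for illustration, not the authors' implementation.

```python
def grad_cam(activations, gradients):
    """Grad-CAM map from K feature maps and their gradients.

    Both arguments are nested lists of shape K x H x W. In practice they
    would come from a convolutional layer of the trained network.
    """
    k = len(activations)
    h = len(activations[0])
    w = len(activations[0][0])
    # Channel weights: average each gradient map over its spatial positions.
    weights = [sum(sum(row) for row in gradients[c]) / (h * w) for c in range(k)]
    # Weighted sum of the feature maps over channels, followed by ReLU.
    cam = [
        [max(0.0, sum(weights[c] * activations[c][i][j] for c in range(k)))
         for j in range(w)]
        for i in range(h)
    ]
    # Scale to [0, 1] for display; an all-zero map is left unchanged.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


def matching_degree(cam, gt_box, threshold=0.5):
    """Hypothetical metric (an assumption, not taken from the paper):
    fraction of highlighted pixels that lie inside the ground-truth box.

    gt_box is (row0, col0, row1, col1) with exclusive end indices.
    """
    r0, c0, r1, c1 = gt_box
    highlighted = [
        (i, j)
        for i, row in enumerate(cam)
        for j, v in enumerate(row)
        if v >= threshold
    ]
    if not highlighted:
        return 0.0
    inside = sum(1 for i, j in highlighted if r0 <= i < r1 and c0 <= j < c1)
    return inside / len(highlighted)


if __name__ == "__main__":
    # Toy 2-channel, 2x2 example: channel 0 supports the class, channel 1 opposes it.
    acts = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
    grads = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
    heatmap = grad_cam(acts, grads)
    print(heatmap)
    print(matching_degree(heatmap, (0, 0, 1, 1)))
```

In the toy example only the top-left position remains highlighted, because the second channel has a negative weight and its contribution is removed by the ReLU. A real pipeline would operate on array types and upsample the map to the input image size before comparing it with the annotated region.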

Keywords

microscopic vision; micro-assembly; convolutional neural network; attention mechanism; gradient-weighted class activation mapping

Subject

Engineering, Mechanical Engineering
