Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

An Explainable Vision Transformer Model Based White Blood Cells Classification and Localization

Version 1 : Received: 15 June 2023 / Approved: 15 June 2023 / Online: 15 June 2023 (08:42:51 CEST)

A peer-reviewed article of this Preprint also exists.

Katar, O.; Yildirim, O. An Explainable Vision Transformer Model Based White Blood Cells Classification and Localization. Diagnostics 2023, 13, 2459. https://doi.org/10.3390/diagnostics13142459

Abstract

Blood cell analysis is a crucial diagnostic process in medical practice. In particular, detecting white blood cells (WBCs) is essential for diagnosing many diseases. The manual screening of blood films is a time-consuming and subjective process, which can lead to inconsistencies and errors. Therefore, automated detection of blood cells can improve the accuracy and efficiency of the screening process. In this study, an explainable Vision Transformer (ViT) model was proposed for the automatic detection of WBCs from blood films. The proposed model utilizes the self-attention mechanism to extract relevant features from the input images and leverages transfer learning by incorporating pre-trained model weights to improve its performance. The proposed model achieved a classification accuracy of 99.40% for five distinct types of WBCs and exhibited potential to reduce the time pathologists spend on manual screening of blood films. Upon examination of the misclassified test samples, it was observed that incorrect predictions were correlated with the presence or absence of granules in the cell samples. To validate this observation, the dataset was divided into two classes, namely Granulocytes and Agranulocytes, and a secondary training process was conducted. The resulting ViT model trained for binary classification achieved an accuracy of 99.70%, recall of 99.54%, precision of 99.32%, and F1-score of 99.43% during the test phase. To ensure the reliability of the ViT model's multi-class classification of WBCs, the pixel regions on which the model focuses when making predictions were visualized using the Score-CAM algorithm.
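The abstract describes transfer learning with a pre-trained ViT and a five-way WBC classification head. The following is a minimal sketch of that setup, not the authors' exact pipeline: the choice of the timm library, the ViT-B/16 backbone, the dataset folder layout, and all hyperparameters (batch size, learning rate, epoch count) are illustrative assumptions; the Score-CAM visualization step is not shown.

```python
# Minimal sketch: fine-tuning an ImageNet-pretrained ViT for 5-class WBC
# classification. Library choices (timm, torchvision) and the dataset folder
# layout are illustrative assumptions, not the authors' exact pipeline.
import torch
import torch.nn as nn
import timm
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Standard ImageNet preprocessing; ViT-B/16 expects 224x224 inputs.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: one sub-directory per WBC type
# (e.g. basophil/, eosinophil/, lymphocyte/, monocyte/, neutrophil/).
train_set = datasets.ImageFolder("wbc/train", transform=preprocess)
val_set = datasets.ImageFolder("wbc/val", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
val_loader = DataLoader(val_set, batch_size=32)

device = "cuda" if torch.cuda.is_available() else "cpu"

# Transfer learning: load pre-trained ViT weights and replace the
# classification head with a five-way output layer.
model = timm.create_model("vit_base_patch16_224", pretrained=True,
                          num_classes=len(train_set.classes)).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for epoch in range(10):  # illustrative epoch count
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

    # Simple validation accuracy per epoch.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    print(f"epoch {epoch + 1}: val accuracy {correct / total:.4f}")
```

The binary Granulocyte/Agranulocyte experiment described in the abstract would follow the same pattern, with the dataset relabeled into two classes so that the replaced head has two outputs instead of five.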

Keywords

Vision Transformers; white blood cells; explainable AI models; deep learning; Score-CAM

Subject

Engineering, Other
