Submitted: 10 February 2026
Posted: 10 February 2026
Abstract
Keywords:
Introduction
Materials and Methods
Study Population and Inclusion Criteria
Study Population and Selection Criteria
Imaging Protocol
Dataset and Preprocessing
Deep Learning Architecture: Gated Attention MIL
1. Instance-Level Feature Extraction: Each patient \( P_i \) is represented as a bag containing \( K \) MRI slices. A deep Convolutional Neural Network (CNN) was employed to transform the inflammatory tissue characteristics present in each slice (such as intensity variations and structural distortions) into a low-dimensional feature vector \( h_{ik} \) [9]. The ResNet-18 model, pre-trained on ImageNet, was employed as the backbone architecture [10]. The final fully connected layer of the model was removed, and a feature vector of dimension \( d = 512 \) was extracted for each slice.
2. Gated Attention Mechanism: The most critical step in detecting active sacroiliitis is identifying which slices in the MRI series show signs of inflammation. A Gated Attention Mechanism was employed to prevent healthy slices from negatively influencing the model's decision.
3. Bag Representation and Classification: A single patient-level feature vector \( z_i \) is obtained by aggregating all instance feature vectors, weighted by their attention scores \( a_k \).
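
The three steps above can be illustrated with a minimal sketch of gated attention pooling. It assumes the standard gated form from the attention-based MIL literature, \( a_k = \mathrm{softmax}_k\big(w^\top(\tanh(V h_{ik}) \odot \sigma(U h_{ik}))\big) \) and \( z_i = \sum_k a_k h_{ik} \). The parameter names `V`, `U`, `w`, the hidden size `L`, and the toy dimensions are assumptions made for illustration, not values reported in this study, and random vectors stand in for the ResNet-18 slice features.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gated_attention_pool(H, V, U, w):
    """Aggregate K instance vectors into one bag vector.

    H: K instance feature vectors (one per slice), each of length d.
    V, U: hypothetical learned matrices of shape (L, d).
    w: hypothetical learned vector of length L.
    Returns (attention weights a_k, bag vector z).
    """
    scores = []
    for h in H:
        t = [math.tanh(x) for x in matvec(V, h)]                # content branch
        g = [1.0 / (1.0 + math.exp(-x)) for x in matvec(U, h)]  # sigmoid gate
        scores.append(sum(wi * ti * gi for wi, ti, gi in zip(w, t, g)))
    # Softmax over the K slices (shifted by the max for numerical stability).
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    a = [e / total for e in exps]
    # Attention-weighted sum of the instance vectors.
    d = len(H[0])
    z = [sum(a[k] * H[k][j] for k in range(len(H))) for j in range(d)]
    return a, z


# Toy example: K = 5 slices, d = 8 features (the text uses d = 512), L = 4.
rng = random.Random(0)
K, d, L = 5, 8, 4
H = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(K)]
V = [[rng.gauss(0, 0.5) for _ in range(d)] for _ in range(L)]
U = [[rng.gauss(0, 0.5) for _ in range(d)] for _ in range(L)]
w = [rng.gauss(0, 1) for _ in range(L)]

a, z = gated_attention_pool(H, V, U, w)
print("attention weights:", [round(x, 3) for x in a])
print("bag vector length:", len(z))
```

Because the weights sum to one, slices with low scores contribute little to the bag vector, which is how uninflamed slices are kept from dominating the patient-level decision. In a trained model the bag vector would then be passed to a classifier head, which is omitted here.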
Test-Time Augmentation (TTA)
Findings
Analysis of F1-Score Training Logs and Model Performance
Discussion
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Sieper, J., & Poddubnyy, D. (2017). Axial spondyloarthritis. The Lancet, 390(10089), 73-84.
2. Walsh, J. A., & Magrey, M. (2021). Clinical manifestations and diagnosis of axial spondyloarthritis. JCR: Journal of Clinical Rheumatology, 27(8), e547-e560.
3. Rudwaleit, M. V., Van Der Heijde, D., Landewé, R., Listing, J., Akkoc, N., Brandt, J., ... & Sieper, J. (2009). The development of Assessment of SpondyloArthritis international Society classification criteria for axial spondyloarthritis (part II): validation and final selection. Annals of the Rheumatic Diseases, 68(6), 777-783.
4. van den Berg, R., Lenczner, G., Thévenin, F., Claudepierre, P., Feydy, A., Reijnierse, M., ... & van Der Heijde, D. (2015). Classification of axial SpA based on positive imaging (radiographs and/or MRI of the sacroiliac joints) by local rheumatologists or radiologists versus central trained readers in the DESIR cohort. Annals of the Rheumatic Diseases, 74(11), 2016-2021.
5. Kepp, F. H., Huber, F. A., Wurnig, M. C., Mannil, M., Kaniewska, M., Guglielmi, R., ... & Guggenberger, R. (2021). Differentiation of inflammatory from degenerative changes in the sacroiliac joints by machine learning supported texture analysis. European Journal of Radiology, 140, 109755.
6. Faleiros, M. C., Nogueira-Barbosa, M. H., Dalto, V. F., Júnior, J. R. F., Tenório, A. P. M., Luppino-Assad, R., ... & de Azevedo-Marques, P. M. (2020). Machine learning techniques for computer-aided classification of active inflammatory sacroiliitis in magnetic resonance imaging. Advances in Rheumatology, 60(1), 25.
7. Bordner, A., Aouad, T., Medina, C. L., Yang, S., Molto, A., Talbot, H., ... & Feydy, A. (2023). A deep learning model for the diagnosis of sacroiliitis according to Assessment of SpondyloArthritis International Society classification criteria with magnetic resonance imaging. Diagnostic and Interventional Imaging, 104(7-8), 373-383.
8. Ilse, M., Tomczak, J., & Welling, M. (2018, July). Attention-based deep multiple instance learning. In International Conference on Machine Learning (pp. 2127-2136). PMLR.
9. Kim, H. E., Cosa-Linan, A., Santhanam, N., Jannesari, M., Maros, M. E., & Ganslandt, T. (2022). Transfer learning for medical image classification: a literature review. BMC Medical Imaging, 22(1), 69.
10. Talo, M., Yildirim, O., Baloglu, U. B., Aydin, G., & Acharya, U. R. (2019). Convolutional neural networks for multi-class brain disease detection using MRI images. Computerized Medical Imaging and Graphics, 78, 101673.
11. Hu, H., Ye, R., Thiyagalingam, J., Coenen, F., & Su, J. (2023). Triple-kernel gated attention-based multiple instance learning with contrastive learning for medical image analysis. Applied Intelligence, 53(17), 20311-20326.
12. Ilse, M., Tomczak, J. M., & Welling, M. (2020). Deep multiple instance learning for digital histopathology. In Handbook of Medical Image Computing and Computer Assisted Intervention (pp. 521-546). Academic Press.
13. Dietterich, T. G., Lathrop, R. H., & Lozano-Pérez, T. (1997). Solving the multiple instance problem with axis-parallel rectangles. Artificial Intelligence, 89(1-2), 31-71.
14. Wang, G., Li, W., Aertsen, M., Deprest, J., Ourselin, S., & Vercauteren, T. (2019). Aleatoric uncertainty estimation with test-time augmentation for medical image segmentation with convolutional neural networks. Neurocomputing, 338, 34-45.
15. Kandel, I., & Castelli, M. (2021). Improving convolutional neural networks performance for image classification using test time augmentation: a case study using MURA dataset. Health Information Science and Systems, 9(1), 33.
16. Bressem, K. K., Adams, L., Proft, F., Hermann, K. G. A., Diekhoff, T., Spiller, L., ... & Poddubnyy, D. (2022). OP0152 A deep learning framework for MRI detection of active inflammatory and structural changes in the sacroiliac joint consistent with axial spondyloarthritis: an international collaborative study. Annals of the Rheumatic Diseases, 81, 98-99.
17. Zhang, K., Liu, C., Pan, J., Zhu, Y., Li, X., Zheng, J., ... & Hong, G. (2024). Use of MRI-based deep learning radiomics to diagnose sacroiliitis related to axial spondyloarthritis. European Journal of Radiology, 172, 111347.
18. Bressem, K. K., Adams, L. C., Proft, F., Hermann, K. G. A., Diekhoff, T., Spiller, L., ... & Poddubnyy, D. (2022). Deep learning detects changes indicative of axial spondyloarthritis at MRI of sacroiliac joints. Radiology, 305(3), 655-665.
19. Liu, L., Zhong, R., Zhang, Y., Wan, H., Chen, S., Zhang, N., ... & Huang, R. (2025). Diagnosis of sacroiliitis through semi-supervised segmentation and radiomics feature analysis of MRI images. Journal of Magnetic Resonance Imaging, 62(2), 563-572.
20. Nicolaes, J., Tselenti, E., Aouad, T., López-Medina, C., Feydy, A., Talbot, H., ... & Dougados, M. (2025). Performance analysis of a deep-learning algorithm to detect the presence of inflammation in MRI of sacroiliac joints in patients with axial spondyloarthritis. Annals of the Rheumatic Diseases, 84(1), 60-67.
21. Nazzal, W., Thurnhofer-Hemsi, K., & López-Rubio, E. (2024). Improving medical image segmentation using test-time augmentation with MedSAM. Mathematics, 12(24), 4003.
22. Moshkov, N., Mathe, B., Kertesz-Farkas, A., Hollandi, R., & Horvath, P. (2020). Test-time augmentation for deep learning-based cell segmentation on microscopy images. Scientific Reports, 10(1), 5068.



| Finding | Patients with Active Osteitis (N = 276) | Male (n, %) | Female (n, %) | Male Age (mean) | Female Age (mean) |
|---|---|---|---|---|---|
| Bone Marrow Edema | 276 (100.0%) | 104 (37.7%) | 172 (62.3%) | 35.0 | 39.6 |
| Erosion | 209 (75.8%) | 83 (30.0%) | 126 (45.7%) | 35.6 | 40.8 |
| Sclerosis | 157 (56.9%) | 62 (22.3%) | 95 (34.4%) | 35.2 | 41.0 |
| Fatty Deposition | 92 (33.3%) | 47 (17.0%) | 45 (16.3%) | 37.9 | 41.8 |
| Joint Space Narrowing | 92 (33.3%) | 49 (17.7%) | 43 (15.6%) | 36.0 | 38.3 |
| Ankylosis | 13 (4.6%) | 11 (4.0%) | 2 (0.7%) | 36.1 | 58.5 |

| Finding | Estimated n (276) | Male (n, %) | Female (n, %) | Male Age (mean) | Female Age (mean) |
|---|---|---|---|---|---|
| Inflammatory Back Pain | 224 (81.3%) | 90 (32.7%) | 134 (48.6%) | 34.3 | 39.4 |
| Morning Stiffness | 198 (71.7%) | 79 (28.6%) | 119 (43.1%) | 33.8 | 40.1 |
| Psoriasis | 13 (4.7%) | 4 (1.4%) | 9 (3.3%) | 48.2 | 37.0 |
| IBD | 4 (1.4%) | 3 (1.1%) | 1 (0.3%) | 29.0 | 46.0 |
| Uveitis | 8 (2.9%) | 6 (2.2%) | 2 (0.7%) | 38.3 | 42.7 |
| Arthritis | 80 (29.0%) | 33 (12.0%) | 47 (17.0%) | 36.1 | 44.7 |
| Enthesitis | 35 (12.7%) | 11 (4.0%) | 24 (8.7%) | 36.5 | 43.7 |
| Dactylitis | 2 (0.7%) | 1 (0.4%) | 1 (0.4%) | 30.0 | 44.0 |
| Family History | 30 (10.9%) | 12 (4.3%) | 18 (6.5%) | 34.8 | 42.4 |
| HLA-B27 Positivity | 16 (5.8%) | 10 (3.6%) | 6 (2.2%) | 31.0 | 37.7 |
| CRP Positive | 100 (36.2%) | 43 (15.6%) | 57 (20.7%) | 33.9 | 42.1 |
| NSAID Response | 177 (64.1%) | 52 (18.8%) | 125 (45.3%) | 35.9 | 40.3 |

| Finding | Estimated n (276) | Male (n, %) | Female (n, %) | Male Age (mean) | Female Age (mean) |
|---|---|---|---|---|---|
| Rheumatoid Arthritis | 16 (5.8%) | 5 (1.8%) | 11 (3.6%) | 20.6 | 35.6 |
| Psoriatic Arthritis | 5 (1.8%) | 2 (0.7%) | 3 (1.1%) | 46.0 | 31.0 |
| Behçet's Disease | 2 (0.7%) | 2 (0.7%) | 0 (0.0%) | 32.5 | - |
| Familial Mediterranean Fever | 13 (4.7%) | 8 (2.5%) | 5 (1.8%) | 29.0 | 36.6 |
| Sjögren's Syndrome | 2 (0.7%) | 0 (0.0%) | 2 (0.7%) | - | 59.5 |
| Systemic Lupus Erythematosus | 2 (0.7%) | 0 (0.0%) | 2 (0.7%) | - | 40.5 |
| Scleroderma | 2 (0.7%) | 0 (0.0%) | 2 (0.7%) | - | 40.0 |
| Juvenile Rheumatoid Arthritis | 2 (0.7%) | 2 (0.7%) | 0 (0.0%) | 18.0 | - |

| Metric | Baseline ResNet-18 | Proposed System | Description |
|---|---|---|---|
| Accuracy | 75.29% | 85.88% | The proportion of correctly classified cases across all subjects |
| Sensitivity | 85.71% | 92.86% | The proportion of patients with active inflammation correctly identified as positive |
| Specificity | 65.12% | 79.07% | The proportion of healthy individuals correctly identified as negative |
| F1-Score | 0.7742 | 0.8667 | The harmonic mean of sensitivity and positive predictive value (precision) |

| Metric | ResNet-18 | Proposed Method |
|---|---|---|
| True Positive (TP) | 36 | 39 |
| False Negative (FN) | 6 | 3 |
| True Negative (TN) | 28 | 34 |
| False Positive (FP) | 15 | 9 |
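
As a consistency check, the reported accuracy, sensitivity, specificity, and F1-score can be recomputed from the confusion-matrix counts above. The short sketch below uses the standard definitions of these metrics. The function name `metrics` is an illustrative choice, not part of the study's code.

```python
def metrics(tp, fn, tn, fp):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall on inflamed patients
    specificity = tn / (tn + fp)   # recall on healthy individuals
    precision = tp / (tp + fp)     # positive predictive value
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Counts taken from the confusion-matrix table (85 test subjects in each column).
baseline = metrics(tp=36, fn=6, tn=28, fp=15)
proposed = metrics(tp=39, fn=3, tn=34, fp=9)

for name, m in (("ResNet-18", baseline), ("Proposed", proposed)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

The recomputed values (accuracy 0.7529 and 0.8588, sensitivity 0.8571 and 0.9286, specificity 0.6512 and 0.7907, F1 0.7742 and 0.8667) agree with the performance table.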

| Study | Method | Independent Test Accuracy (Acc) |
|---|---|---|
| Nicolaes et al. (2024) | Deep Learning (731 patients) | 74.00% |
| Bressem et al. (2022a) | Deep Learning (EULAR abstract) | 75.00% |
| Bressem et al. (2022b) | Deep Learning (Radiology journal) | 75–79% |
| Faleiros et al. (2020) | Classical ML (MLP) | 75–82% |
| Liu et al. (2024) | Semi-supervised Radiomics | 81.20% |
| Roels et al. (2023) | Machine Learning (ResNet-18) | 81.40% |
| Zhang et al. (2024) | Radiomics + Clinical Hybrid Model | 85.60% |
| This Study | Gated Attention MIL | 85.88% |

Disclaimer/Publisher's Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).