Preprint
Review

This version is not peer-reviewed.

Application of Artificial Intelligence in Shoulder Pathology

A peer-reviewed article of this preprint also exists.

Submitted:

08 April 2024

Posted:

08 April 2024

Abstract
Artificial intelligence (AI) refers to the science and engineering of creating intelligent machines that imitate and extend human intelligence. As modern medicine becomes increasingly multidisciplinary, numerous studies have investigated the ability of AI to address orthopaedic-specific problems. One particular area of investigation is shoulder pathology, a range of disorders or abnormalities of the shoulder joint that cause pain, inflammation, stiffness, weakness, and reduced range of motion. There has not yet been a comprehensive review of the recent advancements in this field. Therefore, the purpose of this review is to evaluate the current AI applications in shoulder pathology. This review mainly summarizes several crucial stages of clinical practice, including predictive models and prognosis, diagnosis, treatment, and physical therapy. In addition, the challenges and future development of AI are also discussed.

1. Introduction

Artificial intelligence (AI) is the science and engineering of creating intelligent machines that imitate and extend human intelligence. It is a branch of computer science concerned with performing tasks by simulating human cognitive functions[1]. Through the analysis and comparison of extensive datasets, AI technology has found innovative applications in medicine and has revolutionized approaches to various healthcare challenges[2,3,4].
In 1955, John McCarthy and his colleagues embarked on a research project to explore the feasibility of reproducing all aspects of human intelligence using a machine/computer with minimal human involvement[5]. This pivotal endeavor marked the establishment of the field of AI, which became the foundation of subsequent computer research and development. The early accomplishments of AI can be attributed to its proficiency in solving tasks that were easy to program formally but challenging for humans to execute[7]. Paradoxically, tasks that appeared effortless for humans proved more challenging because they rely on intuitive processes, which are inherently arduous to formalize in code[8,9].
Machine learning (ML) was first introduced as a subset of AI in 1959 (Figure 1)[10]. ML focuses on the development of algorithms and models that enable computers to learn and make decisions on the basis of input data. Instead of being explicitly programmed to perform a certain task, ML models use statistical techniques to learn from examples, adjust their internal parameters (weights), and progressively improve with experience[11]. According to the learning process, ML can be broadly classified into three groups: supervised learning, unsupervised learning, and reinforcement learning[12]. In supervised learning, the algorithm is trained on labeled data, which means that it is given input-output pairs and learns to map the input to the output[13]. Unsupervised learning involves training on unlabeled data; the algorithm must find patterns or structure within the data, which can potentially reveal hidden patterns not yet recognized by humans[14]. In reinforcement learning, the algorithm is trained to make sequences of decisions by being rewarded or penalized for its actions[15]. Additionally, certain ML models lack interpretability or transparency, which is called the black box phenomenon: although these models can make accurate predictions or classifications, the underlying reasoning or decision-making process is not easily understood by humans[16]. Nevertheless, despite its complexity and occasional opacity, ML has had an immense impact on various domains, ranging from image recognition and natural language processing to recommendation systems and autonomous vehicles[17].
Deep learning (DL), a highly sophisticated advancement of ML, was proposed in the 1980s. It emerged from the neural network research conducted by Geoffrey Hinton and his colleagues[18]. It is a powerful methodology, applicable to supervised as well as unsupervised learning, that is specifically designed to capture complex patterns and relationships in large datasets[19]. DL algorithms are inspired by the intricate connectivity and function of the human brain and are composed of artificial neural networks (ANNs) with multiple layers to perform complex tasks[20]. An ANN includes an input layer, multiple intermediate layers, and an output layer. Each layer comprises interconnected nodes, called artificial neurons or units, which process and transform the input data. The output of one layer acts as the input for the next layer, allowing the network to learn hierarchical representations of the data[21]. The key advantage of DL is its ability to automatically learn and extract relevant features from unstructured and unlabeled data, eliminating the need for manual feature engineering. By iteratively adjusting the weights and biases of the neural network during training, DL models can discover complex and abstract representations that capture intricate patterns present in the data[16,22]. The availability of large datasets and advances in computational power have contributed to the rapid growth of DL and to its state-of-the-art performance in tasks such as image classification, object detection, and machine translation[23,24].
Convolutional Neural Networks (CNNs) are a type of DL algorithm that is highly suitable for analyzing visual data such as images and videos[25]. CNNs are designed to automatically and adaptively learn spatial hierarchies of features from the input data through the use of convolutional layers, which apply filters (also known as kernels) to the input data to extract relevant features. These filters are learned during the training process, allowing the network to identify patterns such as edges, textures, and shapes at different scales[26,27]. CNNs also typically include other types of layers, such as pooling layers, which downsample the feature maps to reduce computational complexity, and fully connected layers[28].
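The convolution and pooling operations described above can be sketched in a few lines of plain Python. This is a minimal illustration only: the 5x5 toy image and the hand-written vertical-edge kernel are assumptions made here for demonstration, whereas in a real CNN the kernel weights are learned during training.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(fmap, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


# Hypothetical toy image: dark on the left, bright on the right
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written vertical-edge kernel (an assumption; CNNs learn such filters)
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)   # 4x4 feature map, responds only at the edge
pooled = max_pool2d(features)      # 2x2 map after downsampling
```

The feature map is non-zero only where the intensity changes from dark to bright, and pooling halves each spatial dimension while keeping the strongest response in each window.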
Over the last few years, AI technology has been integrated into the field of medicine for multiple purposes, such as clinical diagnosis, decision support, electronic health records, personalized treatments, drug discovery and development, patient care and assistance, and health monitoring[9,29]. It has also been successfully utilized in various subspecialties of orthopedics. One particular area is shoulder pathology, in which AI research has grown rapidly, especially in the past five years[30] (Figure 2). Shoulder pathology refers to a range of disorders or abnormalities affecting the shoulder joint and surrounding soft tissues. These conditions can cause pain, inflammation, stiffness, weakness, and reduced range of motion. Common shoulder pathologies include rotator cuff tears (RCTs), shoulder impingement syndrome (SIS), shoulder instability, shoulder osteoarthritis, and adhesive capsulitis[30,31,32]. However, there has not yet been a comprehensive review of the recent advancements in this field. Therefore, the purpose of this review is to evaluate the current AI applications in the literature concerning shoulder pathologies. This review explores several crucial stages of the clinical process, including predictive models and prognosis, diagnosis, treatment, and physical therapy. A glossary of key terms associated with AI is provided in Table 1.

2. Methods

2.1. Search Strategy

A systematic literature review was conducted following the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). One reviewer carried out structured searches on the PubMed, Google Scholar, and ScienceDirect databases to retrieve all relevant articles published from January 1, 2010 to January 1, 2024. The search query included the terms: (artificial intelligence OR machine learning OR deep learning) AND (shoulder OR shoulder pathology OR shoulder pain OR shoulder disorder OR shoulder surgery OR rotator cuff OR shoulder fracture OR shoulder tendinopathy). The titles, abstracts, and full-text articles were independently screened by two reviewers. The reference lists of the included articles were also reviewed and cross-referenced to identify any other additional relevant studies that were not retrieved through the keyword search.

2.2. Eligibility Criteria and Article Selection

Study eligibility was determined using standardized inclusion and exclusion criteria. Disagreements or discrepancies were resolved through consensus. The inclusion criteria were as follows: (1) full-length original articles involving AI, ML, or DL applications in shoulder pathologies; (2) diagnosis or treatment relevant to orthopedists; (3) randomized controlled trials, non-randomized studies, or observational studies; (4) published in English. Exclusion criteria were as follows: (1) review articles, conference papers, book chapters, and letters to the editor; (2) animal studies and post-mortem studies.
The screening process is shown in Figure 3. After literature screening, 41 studies were included in this review.

3. Rotator Cuff Tears (RCTs)

3.1. Diagnosis

Accurate diagnosis of RCTs is important for providing timely and appropriate treatment for patients. AI technology can analyze medical images with high accuracy and reduce the risk of misdiagnosis.
X-rays are widely recognized for their high negative predictive value (NPV) in diagnosing RCTs [33]. Kim et al. trained a DL algorithm on 6,793 shoulder radiograph series and tested it on 1,095 radiograph series. The algorithm accurately ruled out significant RCTs based on shoulder radiographs, with a sensitivity of 97.3%, an NPV of 96.6%, and a negative likelihood ratio (LR-) of 0.06. Subgroup analysis revealed that age <60 years, non-dominant side, absence of trauma history, and ultrasound examination were associated with negative test results, and NPVs were higher in patients younger than 60 years and those examined with ultrasound[33]. In another study, a DL algorithm was developed to evaluate subscapularis tendon tears using axillary lateral shoulder radiography. A dataset of 2,779 radiographs was used for training, and the algorithm output the probability of a subscapularis tendon tear exceeding 50% of the tendon thickness. The algorithm's performance was validated on two distinct test sets, with arthroscopy and MRI findings serving as the respective reference standards. The area under the curve (AUC) was 0.83 and 0.82 for the two test sets. At the high-sensitivity cutoff point, the sensitivity was 91.4% and 90.2%, and the NPV was 90.4% and 89.5%, for the respective test sets. The algorithm identified the subscapularis insertion site at the lesser tuberosity as the most sensitive region[34]. Iio et al. also developed a DL algorithm using shoulder radiography as a screening tool for RCTs, which showed high diagnostic performance for full-thickness tears, with an AUC of 0.82, sensitivity of 94.5%, NPV of 96.2%, and LR- of 0.10[35].
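For readers less familiar with the screening metrics quoted above, the following sketch shows how sensitivity, specificity, NPV, and LR- follow from a 2x2 confusion matrix. The counts are hypothetical values chosen for easy arithmetic and are not taken from any of the cited studies.

```python
def screening_metrics(tp, fp, tn, fn):
    """Rule-out metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # proportion of true tears that test positive
    specificity = tn / (tn + fp)          # proportion of intact cuffs that test negative
    npv = tn / (tn + fn)                  # probability of no tear given a negative test
    lr_negative = (1 - sensitivity) / specificity  # negative likelihood ratio (LR-)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "lr_negative": lr_negative,
    }


# Hypothetical counts, for illustration only
metrics = screening_metrics(tp=90, fp=40, tn=60, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A low LR- (close to 0) means that a negative result substantially lowers the odds of disease, which is why a high-sensitivity cutoff suits a rule-out screening tool even when specificity is modest.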
Currently, magnetic resonance imaging (MRI) is widely recognized as the most efficient and reliable noninvasive technique for examining RCTs. DL has shown promise in accurately detecting and classifying RCTs on shoulder MRI scans. A DL model was developed using 11,925 MRI scans by Lin et al. [36]. The model achieved excellent performance, with an AUC of 0.93 for supraspinatus tears, 0.89 for infraspinatus tears, and 0.90 for subscapularis tears. Notably, it demonstrated high accuracy for full-thickness tears, with AUCs of 0.98, 0.99, and 0.95 for the respective tendons. Additionally, multisequence input yielded improved results for some tear types. The accuracy of the DL model compared favorably with that of specialized radiologists, highlighting its potential as a valuable tool in clinical practice.
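The AUC values reported throughout this review can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows; the labels and scores are hypothetical, and the quadratic-time loop is chosen for clarity rather than efficiency.

```python
def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical tear labels (1 = tear) and model probabilities
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc_value = auc(labels, scores)   # 8 of 9 pairs are ranked correctly
print(f"AUC = {auc_value:.3f}")
```

Because the AUC depends only on the ranking of scores, it is independent of any particular decision threshold, unlike sensitivity and specificity.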
DL is also a viable approach for the automated detection, classification, and segmentation of supraspinatus tears on MRI scans. A total of 200 shoulder MRI scans were retrospectively collected by Yao et al.[37], which contained full-thickness tears, partial-thickness tears, or intact supraspinatus tendons. The researchers developed a 3-stage pipeline, including a slice selection network, a segmentation network based on an encoder-decoder architecture (U-Net), and a custom multi-input classifier. The DL model achieved a sensitivity of 85.0%, specificity of 85.0%, AUC of 0.943, and Dice similarity coefficient (DSC) of 0.814. No significant difference in accuracy was observed between 1.5 T and 3.0 T MRI scans.
CNNs play a crucial role in enhancing the analysis and interpretation of shoulder MRI scans. A 2D CNN model was developed by Guo et al. to automatically detect supraspinatus tears, trained on 701 shoulder MRIs and validated on 69 arthroplasty MRIs[38]. The model achieved high F1-scores and sensitivity on both the surgery and internal test sets. Subgroup analyses confirmed its robustness across tear degrees and MRI field strengths. A comparison of diagnostic accuracy with clinicians revealed that the model was equivalent to senior clinicians and better than junior clinicians.
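The Dice similarity coefficient used to score segmentations is twice the overlap between the predicted and reference masks divided by the sum of their sizes. The sketch below uses two tiny hypothetical binary masks; it is not the evaluation code of any cited study.

```python
def dice(pred, truth):
    """Dice similarity coefficient between two binary masks given as nested lists."""
    pred_on = {(i, j) for i, row in enumerate(pred) for j, v in enumerate(row) if v}
    truth_on = {(i, j) for i, row in enumerate(truth) for j, v in enumerate(row) if v}
    if not pred_on and not truth_on:
        return 1.0  # convention assumed here: two empty masks agree perfectly
    return 2 * len(pred_on & truth_on) / (len(pred_on) + len(truth_on))


# Hypothetical 2x3 masks: 1 marks pixels labeled as tendon tear
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]

dsc = dice(predicted, reference)   # 2 shared pixels, 3 + 3 labeled pixels
print(f"DSC = {dsc:.3f}")
```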
2D CNNs process data along two dimensions, namely width and height, whereas 3D CNNs incorporate an additional dimension (depth or time) and can therefore capture more complex patterns and relationships, making them more suitable for analyzing volumetric data. A 3D U-Net CNN algorithm was developed to identify, segment, and visually represent RCTs in 3D, using MRI data from 303 patients with RCTs [39]. Two shoulder specialists labeled the RCTs in the entire MRIs using in-house developed software. The CNN algorithm was trained following augmentation of the training dataset and tested using randomly selected test data, maintaining a 6:2:2 ratio for training, validation, and test data. The 3D U-Net CNN successfully detected, segmented, and visualized RCT areas with a DSC of 94.3%, sensitivity of 97.1%, specificity of 95.0%, precision of 84.9%, F1-score of 90.5%, and Youden index of 91.8%. Thus, the proposed method demonstrated high accuracy and successful 3D visualization. Shim et al.[40] used the Voxception-ResNet (VRN) structure to train a 3D CNN algorithm to automatically detect RCT presence, classify tear size, and visualize tear location in 3D on a dataset of MRI data from 2,124 patients. The proposed method outperformed orthopedists in terms of accuracy and specificity. Moreover, the generated 3D class activation map (CAM) provided valuable information on tear localization and size.
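Two of the summary metrics quoted above have simple closed forms: the F1-score is the harmonic mean of precision and sensitivity (recall), and the Youden index is sensitivity plus specificity minus one. The sketch below uses hypothetical input values; published figures are often averaged per case or per fold, so they need not reproduce these formulas exactly when recomputed from aggregate numbers.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def youden_index(sensitivity, specificity):
    """Youden's J: 0 for a chance-level test, 1 for a perfect one."""
    return sensitivity + specificity - 1


# Hypothetical values, for illustration only
f1 = f1_score(precision=0.80, recall=0.90)
j = youden_index(sensitivity=0.90, specificity=0.85)
print(f"F1 = {f1:.3f}, Youden J = {j:.3f}")
```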
Shoulder MRI using standard multiplanar sequences often requires long scan times. Accelerated sequences shorten the scan time but are limited in terms of noise and resolution. To address this, DL-based reconstruction (DLR) has been proposed as a potential solution to reduce scan time while preserving image quality. In a retrospective study involving 105 patients who underwent 110 shoulder MRI examinations, standard sequences (scan time: 9 minutes 23 seconds) and accelerated sequences (scan time: 3 minutes 5 seconds; 67% reduction) were compared. The standard sequences were reconstructed conventionally, while the accelerated sequences were reconstructed using both conventional and DLR pipelines. Two radiologists evaluated the images for subjective image quality, artifacts, and specific pathologies. Diagnostic accuracy was assessed using arthroscopic findings as the reference standard in 27 patients who underwent arthroscopy. The results indicated that the accelerated sequences with DLR provided subjective image quality, artifacts, and diagnostic performance similar to those of the standard sequences[41]. Liu et al.[42] showed a significantly reduced scan time (6 min 1 s vs. 11 min 25 s) and higher image quality with DLR MRI compared with the conventional method. In the image quality satisfaction survey covering 400 patients, DL-MRI received high scores from all radiologists. Kaniewska et al.[43] also reported that DLR could improve diagnostic accuracy and image quality, with thorough assessment of the subacromial bursa and good agreement for other shoulder structures.
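The quoted scan-time reduction can be checked with a short calculation. The scan times are those given in the text; the percentage for the second comparison is derived here and is not stated in the cited study.

```python
def to_seconds(minutes, seconds):
    """Convert a scan time given in minutes and seconds to seconds."""
    return 60 * minutes + seconds


def relative_reduction(standard_s, accelerated_s):
    """Fraction of the standard scan time saved by the accelerated protocol."""
    return (standard_s - accelerated_s) / standard_s


# Standard vs. accelerated sequences [41]: 9 min 23 s vs. 3 min 5 s
reduction_41 = relative_reduction(to_seconds(9, 23), to_seconds(3, 5))
# Conventional vs. DLR protocol [42]: 11 min 25 s vs. 6 min 1 s (derived here)
reduction_42 = relative_reduction(to_seconds(11, 25), to_seconds(6, 1))

print(f"[41] reduction: {reduction_41:.1%}")
print(f"[42] reduction: {reduction_42:.1%}")
```

The first value rounds to the 67% reported in the text; the second corresponds to a reduction of roughly 47%.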
Ultrasound imaging has been identified as a valid alternative to MRI. It offers several advantages over MRI, including real-time imaging, cost-effectiveness, wide availability, and dynamic assessment [44]. However, speckle noise can degrade image resolution in ultrasound imaging, making conventional vision-based algorithms ineffective for segmenting diseased regions. Lee et al.[45] proposed a novel fully convolutional network called Segmentation Model Adopting a pre-trained Classification Architecture (SMART-CA), which incorporated an integrated positive loss function (IPLF) to accurately locate RCTs on ultrasound images acquired during orthopedic examinations. SMART-CA utilizes a pre-trained network to extract distinct features that improve segmentation accuracy, and IPLF efficiently optimizes SMART-CA for imbalanced datasets such as RCT images. The experimental results indicated that SMART-CA with IPLF achieved improved precision, recall, and DSC, was robust to speckle noise, and outperformed existing state-of-the-art networks. In another study, a total of 194 ultrasound images were used to train and test five pre-trained CNN models. Among them, DenseNet121 demonstrated the best classification performance, with an accuracy of 88.2%, sensitivity of 93.8%, specificity of 83.6%, and AUC of 0.832. Gradient-weighted class activation mapping (Grad-CAM) highlighted the image features that were most influential in the learning process[46].

3.2. Predictive Models and Prognosis

In addition to the diagnosis of RCTs, AI technology has been applied in predictive models that evaluate functional and anatomical outcomes according to various pre-operative factors. The occupation ratio and fatty infiltration of the supraspinatus muscle are crucial parameters for predicting the diagnosis and treatment prognosis of RCTs. Ro et al.[47] employed a DL model to segment the supraspinatus muscle and fossa regions on MRI scans in order to quantitatively measure the occupation ratio of the supraspinatus muscle and to calculate the amount of fatty infiltration of the muscle using the Otsu thresholding technique. The model exhibited high DSC, accuracy, sensitivity, and specificity in the segmentation. The fatty infiltration measure varied significantly across Goutallier grades[48]. Furthermore, a strong negative correlation was observed between occupation ratio and fatty infiltration. Kim et al.[49] also proposed a DL model, developed from an MRI dataset of 240 patients with various disease severities, to detect the supraspinatus muscle and fossa regions; it achieved high accuracy and DSC. They reported that this model could help clinicians accurately track preoperative and postoperative changes in muscle volume in the supraspinatus fossa. Beyond MRI, Taghizadeh et al.[50] developed and verified a CNN model that automatically measures and characterizes the degeneration of rotator cuff muscles, using a total of 103 shoulder CT scans from 95 patients with glenohumeral osteoarthritis. The automatic CNN segmentations achieved DSC values comparable to those of manual segmentations. The CNN model also rapidly quantified muscle atrophy, fatty infiltration, and overall muscle degeneration, providing accurate and reliable predictions. In addition, Medina et al.[51] performed automated segmentations of shoulder MRI images using two CNN models, in which Model A was created for Y-view selection and Model B for muscle segmentation. They concluded that the combination of deep CNN algorithms could achieve accurate and reliable Y-view selection and automated muscle segmentation.
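To make the two quantities above concrete, the sketch below implements Otsu thresholding on a list of 8-bit pixel intensities and computes an occupation ratio from two areas. It is an illustration under stated assumptions, not the pipeline of Ro et al.: the pixel values and areas are hypothetical, and it is assumed here that fat appears brighter than muscle, so pixels above the threshold are counted as fat.

```python
def otsu_threshold(pixels, levels=256):
    """Return the intensity t that maximizes between-class variance.

    Pixels with intensity <= t form the lower class, pixels > t the upper class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the lower class
    sum0 = 0    # intensity sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue          # lower class still empty
        if w1 == 0:
            break             # upper class empty, nothing left to split
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2   # proportional to between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def occupation_ratio(muscle_area, fossa_area):
    """Supraspinatus muscle area divided by supraspinatus fossa area."""
    return muscle_area / fossa_area


# Hypothetical intensities inside a segmented muscle region (darker = muscle, brighter = fat)
roi = [20, 22, 25, 24, 21, 23, 200, 210, 205]
threshold = otsu_threshold(roi)
fat_fraction = sum(p > threshold for p in roi) / len(roi)
ratio = occupation_ratio(muscle_area=450, fossa_area=900)   # hypothetical areas in pixels
```

With these toy values the threshold separates the two intensity clusters, one third of the pixels are classified as fat, and the muscle occupies half of the fossa.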
To identify the relationship between important clinical features and the prediction of RCTs, Li et al.[52] conducted a retrospective study including patients with shoulder pain and dysfunction who completed questionnaires and physical examinations in outpatient settings. Six ML models were developed and assessed using accuracy, AUC, and Brier scores. Among them, the XGBoost model exhibited superior performance. Moreover, the Shapley plot highlighted the Jobe test, Bear hug test, and age as the most important variables for predicting RCTs. Potty et al.[53] created an ML model to predict post-operative functional outcomes following arthroscopic rotator cuff repair by collecting pre-operative and post-operative patient data. The proposed model predicted post-operative scores accurately. The most essential features in predicting patient recovery were the pre-operative American Shoulder and Elbow Surgeons (ASES) score, pre-operative pain score, body mass index (BMI), age, and tendon quality. The authors reported that the model is valuable for pre-operative counseling, planning, and resource allocation. The main studies of AI applications in RCTs are summarized in Table 2.
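The Brier score mentioned above measures how well predicted probabilities are calibrated: it is the mean squared difference between the predicted probability and the observed binary outcome, so lower is better. The following sketch uses hypothetical outcomes and probabilities.

```python
def brier_score(outcomes, probabilities):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have the same length")
    return sum((p - y) ** 2 for y, p in zip(outcomes, probabilities)) / len(outcomes)


# Hypothetical data: 1 = rotator cuff tear confirmed, 0 = no tear
outcomes = [1, 0, 1, 0]
probabilities = [0.9, 0.2, 0.6, 0.4]

score = brier_score(outcomes, probabilities)
print(f"Brier score = {score:.4f}")
```

A model that always predicts 0.5 scores 0.25 on any dataset, which gives a useful baseline when reading reported Brier scores.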

3.3. Physical Therapy

Physical therapy has been established as an effective treatment for RCTs, resulting in significant improvements in patient-reported outcomes and reducing the need for surgery. However, poor adherence to physical therapy programs remains a challenge in managing common shoulder disorders, particularly with unsupervised home exercise programs[30]. To address this, Burns et al.[54] recruited twenty healthy adults without prior shoulder disorders, who performed seven exercises from an evidence-based rotator cuff physiotherapy protocol while data from a 6-axis inertial sensor on the active extremity were collected. Four supervised learning algorithms were trained and optimized within an activity recognition chain framework to classify the exercises. The algorithms' performance was evaluated using 5-fold cross-validation, stratified first temporally and then by subject. All algorithms achieved a categorical classification accuracy of over 94% in the temporally stratified cross-validation, with the convolutional recurrent neural network (CRNN) algorithm performing the best at 99.4%. The authors demonstrated the technical feasibility of using such an approach to monitor and assess adherence to shoulder physiotherapy exercise protocols at home.
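The distinction between temporal and subject-wise cross-validation matters because sensor windows from the same person are highly correlated. A subject-wise split keeps all recordings of one participant in the same fold, so the test fold contains only unseen people. The sketch below is a minimal illustration, not the authors' code: the twenty subjects and five folds follow the text, while the three windows per subject and the round-robin fold assignment are assumptions.

```python
def subject_folds(subject_ids, k=5):
    """Assign a fold to every sample so that all samples of a subject share one fold."""
    subjects = sorted(set(subject_ids))
    fold_of = {s: i % k for i, s in enumerate(subjects)}   # round-robin over subjects
    return [fold_of[s] for s in subject_ids]


def split_indices(folds, test_fold):
    """Return (train_indices, test_indices) for one cross-validation round."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test


# 20 subjects (as in the study); 3 sensor windows each is a hypothetical choice
sample_subjects = [s for s in range(20) for _ in range(3)]
folds = subject_folds(sample_subjects, k=5)

for test_fold in range(5):
    train_idx, test_idx = split_indices(folds, test_fold)
    train_subjects = {sample_subjects[i] for i in train_idx}
    test_subjects = {sample_subjects[i] for i in test_idx}
    print(test_fold, sorted(test_subjects), len(train_idx), len(test_idx))
```

Accuracy under a subject-wise split is usually the more realistic estimate of how a classifier will perform on a new patient exercising at home.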

4. Shoulder Instability

4.1. Diagnosis

To date, there is no definitive evidence regarding glenohumeral translation in dynamic glenohumeral joint stability models. Therefore, a standardized method for assessing shoulder kinematics can provide a clearer understanding and benefit patient treatment. Croci et al.[55] obtained fluoroscopic images of both shoulders of 12 participants with unilateral RCTs and 13 asymptomatic subjects. They designed a 3D full-resolution CNN (nnU-Net) model to automatically locate five landmarks (glenohumeral joint center, humeral shaft, inferior and superior edges of the glenoid, and the most lateral point of the acromion) as well as a calibration sphere. The model achieved accurate landmark detection, with all landmarks and the calibration sphere located within 1.5 mm, except for the humeral shaft landmark, which showed a difference of 9.6 mm. The proposed model provides a reliable and efficient means of automatically identifying and tracking anatomical landmarks, enabling measurement of clinically relevant anatomical configurations and investigation of dynamic glenohumeral joint stability in pathological shoulders.
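Landmark accuracy of this kind is typically expressed as the Euclidean distance between the predicted and the manually annotated position, converted from pixels to millimetres. The sketch below assumes that the calibration sphere of known diameter supplies the pixel-to-millimetre scale; the coordinates, sphere diameter, and function names are hypothetical and are not taken from the cited study.

```python
import math


def mm_per_pixel(sphere_diameter_mm, sphere_diameter_px):
    """Image scale derived from a calibration object of known physical size."""
    return sphere_diameter_mm / sphere_diameter_px


def landmark_error_mm(predicted_px, annotated_px, scale_mm_per_px):
    """Euclidean distance between two 2D landmark positions, in millimetres."""
    return math.dist(predicted_px, annotated_px) * scale_mm_per_px


# Hypothetical values: a 25 mm sphere that spans 100 pixels in the image
scale = mm_per_pixel(sphere_diameter_mm=25.0, sphere_diameter_px=100.0)
# Hypothetical predicted and annotated positions of one landmark (x, y in pixels)
error = landmark_error_mm((103.0, 204.0), (100.0, 200.0), scale)
print(f"landmark error = {error:.2f} mm")
```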
For assessing osseous injuries associated with anterior shoulder instability, CT scans of the shoulder with 3D reconstruction are considered the gold standard. CT scans provide improved conceptualization and accurate quantification of injuries at the glenoid and humeral head[56]. However, this method exposes patients to considerable radiation. An alternative approach using 3D MRI models, which can be obtained and reconstructed during standard 2D MRI of the shoulder, has recently been advocated. The 3D MRI models have demonstrated equal effectiveness in evaluating bipolar bone loss[57]. Rodrigues et al.[58] collected shoulder MRI images from 100 patients and developed a fully automated 3D CNN segmentation model for proton density-weighted images. The CNN model showed high accuracy in segmenting the humerus and glenoid and in evaluating glenohumeral anatomy and glenoid bone loss (GBL).
Wei et al.[59] collected radiographs of 106 elbows and 140 shoulders, half of which had dislocations. Multiple CNN models were trained and then tested using datasets from external hospitals and online radiology repositories. The CNNs achieved high accuracy in identifying joint dislocations, with AUCs greater than 0.99 on internal test sets and greater than 0.97 on external test sets. The CAMs indicated that the CNNs accurately identified the relevant joints regardless of the presence of dislocations, with excellent generalizability to external test sets.

4.2. Predictive Models and Prognosis

The predictors of optimized functional outcomes after surgery for anterior shoulder instability (ASI), considered from a global rather than a domain-specific perspective, remain elusive. Till et al.[60] used ML clustering to identify predictors of achieving the "optimal observed outcome" after surgery for ASI. Medical records, images, and operative data of patients under 40 years old were analyzed. Of the 200 patients with an average follow-up of 11 years, 64% achieved the "optimal observed outcome", characterized by reduced postoperative pain, low rates of recurrent instability, revision surgery, and osteoarthritis, and improved range of motion. Additionally, 41% achieved a "perfect outcome" across all categories. Negative predictors included a longer time from initial instability to presentation and habitual/voluntary instability, while a predilection toward preoperative subluxations was a positive predictor.

5. Rotator Cuff Calcific Tendinopathy (RCCT)

RCCT is one of the most common causes of shoulder pain. It is characterized by the deposition of calcium hydroxyapatite crystals either inside or around rotator cuff tendons. Although published studies have highlighted a wide range of risk factors for the onset of RCCT, including endocrine disorders, hyperlipidemia, and sports strain, the etiology of symptomatic RCCT remains debated[61,62].
AI technology offers promising advancements in predicting RCCT onset and facilitating targeted early-stage treatment. Dong et al.[63] retrospectively analyzed clinical data from individuals diagnosed with RCCT at their institution. Logistic regression analysis revealed that female gender, hyperlipidemia, diabetes mellitus, and hypothyroidism were independent risk factors for symptomatic RCCT. Gender-stratified analysis showed overlapping risk factors for both men and women. Hyperlipidemia, diabetes mellitus, and hypothyroidism were significantly associated with symptomatic RCCT in men, while diabetes mellitus was significant in women. These findings emphasized the importance of individualized risk assessment for early intervention.
Ultrasound imaging is regarded as an excellent tool for visualizing calcifications within the rotator cuff tendons. Vassalou et al.[64] evaluated the performance of two ML models in predicting long-term complete pain resolution following ultrasound-guided percutaneous irrigation of calcific tendinopathy (US-PICT) in 100 patients with rotator cuff disease. The two models incorporated data related to procedural details, patient characteristics, and calcification properties to predict pain at 1 year post-US-PICT. The results showed an AUC of 69.2% for predicting complete pain resolution at 1 year, with age and baseline visual analogue scale (VAS) scores being the most influential variables. Furthermore, the inclusion of VAS scores at 1 month did not significantly improve the models' performance, indicating that baseline data alone could be useful for predicting patient outcomes following US-PICT. Chiu et al.[65] reported that their DL model was able to assist clinicians in diagnosing supraspinatus calcific tendinopathy during ultrasound examinations with high accuracy, sensitivity, and specificity.

6. Proximal Humeral Fractures (PHFs)

Accurate diagnosis and classification of PHFs are essential for appropriate treatment planning. The complexity arises from the variability in fracture patterns, which can make it difficult to precisely determine the fracture type based solely on visual inspection. Factors such as overlapping bone fragments, subtle displacement, and the presence of associated injuries can further complicate the classification process[64,65].
Automation of fracture classification with AI technology has been shown to improve diagnostic accuracy, reduce inter-observer variability, and accelerate the classification process. A CNN model trained by Chung et al.[66] demonstrated exceptional performance, with a top-1 accuracy of 96% and an AUC of 1.00 for distinguishing PHFs from normal shoulder radiographs. Additionally, the CNN model showed promising results in classifying fracture types based on the Neer classification system, achieving top-1 accuracy ranging from 65% to 86% and AUC ranging from 0.90 to 0.98. Furthermore, the model outperformed orthopedists in both detection and classification tasks. Notably, the CNN model's superiority was more pronounced in complex three-part and four-part fractures. Magneli et al.[67] also evaluated the classification performance of a CNN model for PHFs based on the AO/OTA classification system. The overall AUC for fracture classification was 0.89, including excellent AUCs for diaphyseal humerus fractures (0.97) and clavicle fractures (0.96), and a good AUC for scapula fractures (0.87), which showed that the proposed model could effectively utilize plain radiographs to classify fractures. Dipnall et al.[68] assessed the classification performance of several ML algorithms based on the Neer classification system using six input text datasets, including X-ray and/or CT scan data and patient age and/or sex information. They reported that these ML algorithms achieved satisfactory performance, with one model exhibiting an accuracy of 61% and a one-versus-rest score above 0.8, providing valuable assistance to radiologists and orthopedists by speeding up the classification process.

7. Other Shoulder Pathologies

Scapulohumeral periarthritis, also known as periarthritis of the shoulder, is characterized by a gradual development of shoulder pain, which is more pronounced at night with limited functions [69]. Yu et al.[70] examined the efficiency of combining an intelligent clustering analysis algorithm with musculoskeletal ultrasound imaging for the differential diagnosis and rehabilitation of scapulohumeral periarthritis. The thickness and clarity of the shoulder posterior capsule were observed in different pain groups. Factors such as musculoskeletal ultrasound parameters, length of service, work nature, and work busyness significantly influenced shoulder periarthritis pain. The proposed intelligent algorithm indicated promising accuracy, sensitivity, and specificity when tested on clinical ultrasound samples.
Subacromial impingement syndrome (SIS) is another common disorder causing shoulder pain. Shu et al.[71] recruited 17 participants, who performed shoulder abduction and adduction while ultrasound images were captured. Subacromial motion metrics were extracted from the dynamic ultrasonography using different CNN models, which accurately depicted the trajectory of the humeral greater tubercle in relation to the lateral acromion. The self-transfer learning-based (STL) CNN model performed better than the traditional CNN model, and the errors in measuring the minimal vertical acromiohumeral distance were significantly smaller with the STL-CNN models. This study demonstrated the feasibility of using a CNN model for automatic detection of anatomical landmarks and for capturing essential motion metrics in dynamic shoulder ultrasonography, which is helpful in diagnosing SIS. Jiang et al.[72] included 10 radiomic features in the construction of a radiomics model for ML-based ultrasomic analysis of SIS stage evaluation. They stated that the ML-derived ultrasomics model could provide reliable stage evaluations in patients with SIS.
Shoulder pain attributed to inflammation of the long head of the biceps tendon is a prevalent condition. Bicipital peritendinous effusion (BPE) is the most frequently occurring abnormality associated with the biceps tendon and is connected to different shoulder injuries[73]. Obtaining a clear and accurate ultrasound image is difficult for inexperienced radiologists[74]. Lin et al.[75] designed an automated BPE recognition system to classify inflammation in ultrasound images into four categories: normal, mild, moderate, and severe. Three experiments were conducted to validate the classification performance of the system under different settings. The proposed system achieved an accuracy of 75% for three-class BPE classification (normal, moderate, and severe), which was comparable to other state-of-the-art methods.
Grauhan et al.[76] trained a CNN model to detect the most common causes of acute or chronic shoulder pain on 2700 plain radiographs, which were reviewed and labeled for six findings. The CNN model achieved AUCs of 0.871 for PHFs, 0.896 for joint dislocation, 0.945 for osteoarthritis, and 0.800 for periarticular calcifications. It also showed near-perfect performance in detecting osteosynthesis and endoprosthesis, with AUCs of 0.998 and 1.0, respectively, suggesting that such a CNN model could provide additional assistance and safety for clinicians on duty.
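Because the results above are reported per finding as AUCs, a short sketch of how an AUC can be computed may help. It uses the pairwise interpretation: the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name, labels, and scores below are hypothetical and do not come from the cited study.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = finding present) and model scores for a single finding.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
finding_auc = auc(y_true, y_score)
```

For a multi-label model such as the one described, this calculation would be repeated once per finding, treating each as its own binary problem.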

8. Future of AI

The future applications of AI in shoulder diagnosis and treatment are promising. AI technology is likely to continue improving in its ability to analyze medical imaging for shoulder pathologies, which could lead to earlier and more accurate detection of conditions such as RCTs, PHFs, and shoulder instability. AI models can be trained on large datasets to identify patterns and predict treatment outcomes for specific shoulder conditions, which can assist clinicians in selecting the most effective treatments, including optimal rehabilitation protocols or surgical interventions based on individual patient characteristics[1,77]. Furthermore, AI-powered virtual assistants and chatbots can provide patients with personalized guidance, educational materials, and remote monitoring for post-treatment care, enhancing patient engagement, improving adherence to treatment plans, and enabling continuous support outside clinical settings. In addition, AI-powered wearable devices and sensors can track shoulder function, range of motion, and strength during rehabilitation. These data can be analyzed by AI algorithms to provide personalized feedback, exercise recommendations, and progress tracking. They can also support remote patient monitoring, allowing early identification of complications or deviations from the expected recovery trajectory[77].
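The idea of identifying deviations from an expected recovery trajectory can be sketched as follows. This is a deliberately simple rule-based illustration rather than a learned model: it assumes that a wearable device reports range of motion per post-operative week and that a reference trajectory is available. The function name, the tolerance, and all numeric values are hypothetical placeholders and are not taken from the cited sources.

```python
from typing import Dict, List


def flag_deviations(measured: Dict[int, float], expected: Dict[int, float],
                    tolerance_deg: float) -> List[int]:
    """Return the weeks in which measured range of motion falls short of
    the expected value by more than the tolerance."""
    flagged = []
    for week in sorted(measured):
        if week not in expected:
            continue  # no reference value for this week; skip rather than guess
        shortfall = expected[week] - measured[week]
        if shortfall > tolerance_deg:
            flagged.append(week)
    return flagged


# Hypothetical values: forward flexion in degrees by post-operative week.
expected_rom = {2: 60.0, 4: 90.0, 6: 120.0, 8: 140.0}
measured_rom = {2: 58.0, 4: 85.0, 6: 95.0, 8: 130.0}
weeks_to_review = flag_deviations(measured_rom, expected_rom, tolerance_deg=15.0)
```

A learned model could replace the fixed reference trajectory and tolerance, but the output would serve the same purpose: a short list of time points that a clinician may want to review.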
However, several challenges need to be addressed. One critical challenge is the lack of standardization and quality control of the datasets used to train AI algorithms. The accuracy and generalizability of AI algorithms depend significantly on the quality of their training data. Standardized protocols for data collection, annotation, and curation are needed to ensure the reliability and consistency of AI models. Another challenge is the interpretability and transparency of AI models. As AI algorithms become more complex, it becomes increasingly difficult to understand how they make decisions, which poses problems for clinicians who rely on these models to inform clinical decision-making[78]. It is therefore crucial to develop interpretable AI models that provide clear explanations for their outputs[79]. Furthermore, AI models often deal with vast amounts of personal health information. Legislative measures help establish guidelines for the collection, storage, and usage of these data, ensuring that individuals' privacy rights are protected. AI models can also inadvertently perpetuate biases or discriminate against certain groups if they are trained on biased or incomplete datasets. Legislative frameworks can address these ethical concerns by requiring transparency in AI models and promoting fairness and inclusivity[80].

Author Contributions

Conceptualization, C.C. and D.X.; investigation, C.C. and X.L.; data curation, C.C., X.L. and D.G.; writing-original draft preparation, C.C.; writing-review and editing, D.X. and X.L.; visualization, C.C.; supervision, X.L., D.G. and D.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, P.R.; Lu, L.; Zhang, J.Y.; et al. Application of Artificial Intelligence in Medicine: An Overview. Curr. Med. Sci. 2021, 41, 1105–1115. [Google Scholar] [CrossRef] [PubMed]
  2. Poduval, M.; Ghose, A.; Manchanda, S.; et al. Artificial Intelligence and Machine Learning: A New Disruptive Force in Orthopaedics. Indian J. Orthop. 2020, 54, 109–122. [Google Scholar] [CrossRef] [PubMed]
  3. Haug, C.J.; Drazen, J.M. Artificial Intelligence and Machine Learning in Clinical Medicine. N. Engl. J. Med. 2023, 388, 1201–1208. [Google Scholar] [CrossRef] [PubMed]
  4. Chen, A.F.; Zoga, A.C.; Vaccaro, A.R. Point/Counterpoint: Artificial Intelligence in Healthcare. Healthcare Transformation. 2017, 2, 84–92. [Google Scholar] [CrossRef]
  5. McCarthy, J.; Minsky, M.L.; Rochester, N.; et al. A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence. AI Mag. 2006, 27, 4. [Google Scholar]
  6. Chen, K.; Stotter, C.; Klestil, T.; Nehrer, S. Artificial Intelligence in Orthopedic Radiography Analysis: A Narrative Review. Diagnostics (Basel). 2022, 12, 2235. [Google Scholar] [CrossRef]
  7. Mintz, Y.; Brodie, R. Introduction to Artificial Intelligence in Medicine. Minim. Invasive Ther. Allied Technol. 2019, 28, 73–81. [Google Scholar] [CrossRef] [PubMed]
  8. Loftus, T.J.; Tighe, P.J.; Filiberto, A.C.; et al. Artificial Intelligence and Surgical Decision-Making. JAMA Surg. 2019, 155, 2. [Google Scholar]
  9. Ramkumar, P.N.; Kunze, K.N.; Haeberle, H.S.; et al. Clinical and Research Medical Applications of Artificial Intelligence. Arthroscopy. 2021, 37, 1694–1697. [Google Scholar] [CrossRef] [PubMed]
  10. Samuel, A.L. Some Studies in Machine Learning Using the Game of Checkers. IBM J. Res. Dev. 2000, 44, 206–226. [Google Scholar] [CrossRef]
  11. Choi, R.Y.; Coyner, A.S.; Kalpathy-Cramer, J.; et al. Introduction to Machine Learning, Neural Networks, and Deep Learning. Transl. Vis. Sci. Technol. 2020, 9(2), 14. [Google Scholar]
  12. Sidey-Gibbons, J.A.M.; Sidey-Gibbons, C.J. Machine Learning in Medicine: A Practical Introduction. BMC Med. Res. Methodol. 2019, 19(1), 64. [Google Scholar] [CrossRef] [PubMed]
  13. Razavian, N.; Knoll, F.; Geras, K.J. Artificial Intelligence Explained for Nonexperts. Semin. Musculoskelet. Radiol. 2020, 24(1), 3–11. [Google Scholar] [CrossRef] [PubMed]
  14. Cabitza, F.; Locoro, A.; Banfi, G. Machine Learning in Orthopedics: A Literature Review. Front. Bioeng. Biotechnol. 2018, 6, 75. [Google Scholar] [CrossRef] [PubMed]
  15. Brown, N.; Sandholm, T. Superhuman AI for Multiplayer Poker. Science. 2019, 365(6456), 885–890. [Google Scholar] [CrossRef] [PubMed]
  16. Myers, T.G.; Ramkumar, P.N.; Ricciardi, B.F.; et al. Artificial Intelligence and Orthopaedics: An Introduction for Clinicians. J. Bone Joint Surg. Am. 2020, 102(9), 830–840. [Google Scholar] [CrossRef] [PubMed]
  17. Beam, A.L.; Kohane, I.S. Big Data and Machine Learning in Health Care. JAMA. 2018, 319(13), 1317–1318. [Google Scholar] [CrossRef] [PubMed]
  18. LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature. 2015, 521(7553), 436–444. [Google Scholar] [CrossRef] [PubMed]
  19. Chlap, P.; Min, H.; Vandenberg, N.; Dowling, J.; Holloway, L.; Haworth, A. A Review of Medical Image Data Augmentation Techniques for Deep Learning Applications. J. Med. Imaging Radiat. Oncol. 2021, 65(5), 545–563. [Google Scholar] [CrossRef] [PubMed]
  20. Erickson, B.J. Basic Artificial Intelligence Techniques: Machine Learning and Deep Learning. Radiol. Clin. North Am. 2021, 59(6), 933–940. [Google Scholar] [CrossRef] [PubMed]
  21. Shi, L.; Wang, X.C.; Wang, Y.S. Artificial Neural Network Models for Predicting 1-Year Mortality in Elderly Patients with Intertrochanteric Fractures in China. Braz. J. Med. Biol. Res. 2013, 46(11), 993–999. [Google Scholar] [CrossRef] [PubMed]
  22. LeCun, Y. The Power and Limits of Deep Learning. Res. Technol. Manage. 2018, 61, 22–27. [Google Scholar] [CrossRef]
  23. Egger, J.; Gsaxner, C.; Pepe, A.; et al. Medical Deep Learning-A Systematic Meta-Review. Comput. Methods Programs Biomed. 2022, 221, 106874. [Google Scholar]
  24. Jiang, H.; Diao, Z.; Shi, T.; et al. A Review of Deep Learning-Based Multiple-Lesion Recognition from Medical Images: Classification, Detection, and Segmentation. Comput. Biol. Med. 2023, 157, 106726. [Google Scholar]
  25. Anwar, S.M.; Majid, M.; Qayyum, A.; Awais, M.; Alnowami, M.; Khan, M.K. Medical Image Analysis Using Convolutional Neural Networks: A Review. J. Med. Syst. 2018, 42(11), 226. [Google Scholar] [CrossRef] [PubMed]
  26. Chen, K.; Stotter, C.; Klestil, T.; Nehrer, S. Artificial Intelligence in Orthopedic Radiography Analysis: A Narrative Review. Diagnostics (Basel). 2022, 12(9), 2235. [Google Scholar] [CrossRef] [PubMed]
  27. Shi, L.; Wang, X.C.; Wang, Y.S. Artificial Neural Network Models for Predicting 1-Year Mortality in Elderly Patients with Intertrochanteric Fractures in China. Braz. J. Med. Biol. Res. 2013, 46(11), 993–999.
  28. Salimi, M.; Parry, J.A.; Shahrokhi, R.; Mosalamiaghili, S. Application of Artificial Intelligence in Trauma Orthopedics: Limitation and Prospects. World J. Clin. Cases. 2023, 11(18), 4231–4240. [Google Scholar] [CrossRef] [PubMed]
  29. Haug, C.J.; Drazen, J.M. Artificial Intelligence and Machine Learning in Clinical Medicine. N. Engl. J. Med. 2023, 388(13), 1201–1208. [Google Scholar] [CrossRef] [PubMed]
  30. Familiari, F.; Galasso, O.; Massazza, F.; et al. Artificial Intelligence in the Management of Rotator Cuff Tears. Int. J. Environ. Res. Public Health. 2022, 19(24), 16779. [Google Scholar] [CrossRef] [PubMed]
  31. Bakhsh, W.; Nicandri, G. Anatomy and Physical Examination of the Shoulder. Sports Med. Arthrosc. Rev. 2018, 26(3), e10–e22. [Google Scholar] [CrossRef] [PubMed]
  32. Wessel, L.E.; Eliasberg, C.D.; Bowen, E.; Sutton, K.M. Shoulder and Elbow Pathology in the Female Athlete: Sex-specific Considerations. J. Shoulder Elb. Surg. 2021, 30(5), 977–985. [Google Scholar] [CrossRef] [PubMed]
  33. Kim, Y.; Choi, D.; Lee, K.J.; et al. Ruling out Rotator Cuff Tear in Shoulder Radiograph Series Using Deep Learning: Redefining the Role of Conventional Radiograph. Eur. Radiol. 2020, 30(5), 2843–2852. [Google Scholar] [CrossRef] [PubMed]
  34. Kang, Y.; Choi, D.; Lee, K.J.; Oh, J.H.; Kim, B.R.; Ahn, J.M. Evaluating Subscapularis Tendon Tears on Axillary Lateral Radiographs Using Deep Learning. Eur. Radiol. 2021, 31(12), 9408–9417. [Google Scholar] [CrossRef] [PubMed]
  35. Iio, R.; Ueda, D.; Matsumoto, T.; et al. Deep Learning-based Screening Tool for Rotator Cuff Tears on Shoulder Radiography. J. Orthop. Sci. Published online May 24, 2023.
  36. Lin, D.J.; Schwier, M.; Geiger, B.; et al. Deep Learning Diagnosis and Classification of Rotator Cuff Tears on Shoulder MRI.
  37. Yao, J.; Chepelev, L.; Nisha, Y.; Sathiadoss, P.; Rybicki, F.J.; Sheikh, A.M. Evaluation of a Deep Learning Method for the Automated Detection of Supraspinatus Tears on MRI. Skeletal Radiol. 2022, 51(9), 1765–1775. [Google Scholar] [CrossRef] [PubMed]
  38. Guo, D.; Liu, X.; Wang, D.; Tang, X.; Qin, Y. Development and Clinical Validation of Deep Learning for Automatic Diagnosis of Supraspinatus Tears. J Orthop Surg Res 2023, 18(1), 426.
  39. Lee, S.H.; Lee, J.; Oh, K.S.; et al. Automated 3-dimensional MRI Segmentation for the Posterosuperior Rotator Cuff Tear Lesion Using Deep Learning Algorithm. PLoS One. 2023, 18(5), e0284111. [Google Scholar]
  40. Shim, E.; Kim, J.Y.; Yoon, J.P.; et al. Automated Rotator Cuff Tear Classification Using 3D Convolutional Neural Network. Sci. Rep. 2020, 10(1), 15632. [Google Scholar] [CrossRef] [PubMed]
  41. Hahn, S.; Yi, J.; Lee, H.J.; et al. Image Quality and Diagnostic Performance of Accelerated Shoulder MRI With Deep Learning-Based Reconstruction. AJR Am. J. Roentgenol. 2022, 218(3), 506–516. [Google Scholar] [CrossRef]
  42. Liu, J.; Li, W.; Li, Z.; et al. Magnetic Resonance Shoulder Imaging Using Deep Learning-based Algorithm. Eur. Radiol. 2023, 33(7), 4864–4874. [Google Scholar] [CrossRef] [PubMed]
  43. Kaniewska, M.; Deininger-Czermak, E.; Getzmann, J.M.; Wang, X.; Lohezic, M.; Guggenberger, R. Application of Deep Learning-based Image Reconstruction in MR Imaging of the Shoulder Joint to Improve Image Quality and Reduce Scan Time. Eur. Radiol. 2023, 33(3), 1513–1525. [Google Scholar] [CrossRef] [PubMed]
  44. Nunna, B. Jr.; Parihar, P.; Wanjari, M.; Shetty, N.; Bora, N. High-Resolution Imaging Insights into Shoulder Joint Pain: A Comprehensive Review of Ultrasound and Magnetic Resonance Imaging (MRI). Cureus 2023, 15(11), e48974. [Google Scholar] [CrossRef] [PubMed]
  45. Lee, K.; Kim, J.Y.; Lee, M.H.; Choi, C.H.; Hwang, J.Y. Imbalanced Loss-Integrated Deep-Learning-Based Ultrasound Image Analysis for Diagnosis of Rotator-Cuff Tear. Sensors (Basel) 2021, 21(6), 2214. [Google Scholar] [CrossRef] [PubMed]
  46. Ho, T.T.; Kim, G.T.; Kim, T.; Choi, S.; Park, E.K. Classification of Rotator Cuff Tears in Ultrasound Images Using Deep Learning Models. Med Biol Eng Comput 2022, 60(5), 1269–1278. [Google Scholar] [CrossRef]
  47. Ro, K.; Kim, J.Y.; Park, H.; et al. Deep-learning Framework and Computer Assisted Fatty Infiltration Analysis for the Supraspinatus Muscle in MRI. Sci. Rep. 2021, 11(1), 15065. [Google Scholar] [CrossRef]
  48. Goutallier, D.; Postel, J.M.; Bernageau, J.; Lavau, L.; Voisin, M.C. Fatty Infiltration of Disrupted Rotator Cuff Muscles. Rev. Rhum. Engl. Ed. 1995, 62(6), 415–422. [Google Scholar]
  49. Kim, J.Y.; Ro, K.; You, S.; et al. Development of an Automatic Muscle Atrophy Measuring Algorithm to Calculate the Ratio of Supraspinatus in Supraspinous Fossa Using Deep Learning. Comput Methods Programs Biomed 2019, 182, 105063. [Google Scholar]
  50. Taghizadeh, E.; Truffer, O.; Becce, F.; et al. Deep Learning for the Rapid Automatic Quantification and Characterization of Rotator Cuff Muscle Degeneration from Shoulder CT Datasets. Eur. Radiol. 2021, 31(1), 181–190. [Google Scholar] [CrossRef] [PubMed]
  51. Medina, G.; Buckless, C.G.; Thomasson, E.; Oh, L.S.; Torriani, M. Deep Learning Method for Segmentation of Rotator Cuff Muscles on MR Images. Skeletal Radiol. 2021, 50(4), 683–692. [Google Scholar] [CrossRef] [PubMed]
  52. Li, C.; Alike, Y.; Hou, J.; et al. Machine Learning Model Successfully Identifies Important Clinical Features for Predicting Outpatients with Rotator Cuff Tears. Knee Surg Sports Traumatol Arthrosc 2023, 31(7), 2615–2623. [Google Scholar] [CrossRef] [PubMed]
  53. Potty, A.G.; Potty, A.S.R.; Maffulli, N.; et al. Approaching Artificial Intelligence in Orthopaedics: Predictive Analytics and Machine Learning to Prognosticate Arthroscopic Rotator Cuff Surgical Outcomes. J Clin Med 2023, 12(6), 2369. [Google Scholar] [CrossRef] [PubMed]
  54. Burns, D.M.; Leung, N.; Hardisty, M.; Whyne, C.M.; Henry, P.; McLachlin, S. Shoulder Physiotherapy Exercise Recognition: Machine Learning the Inertial Signals from a Smartwatch. Physiol Meas 2018, 39(7), 075007. [Google Scholar] [CrossRef] [PubMed]
  55. Croci, E.; Hess, H.; Warmuth, F.; et al. Fully Automatic Algorithm for Detecting and Tracking Anatomical Shoulder Landmarks on Fluoroscopy Images with Artificial Intelligence. Eur Radiol 2024, 34(1), 270–278. [Google Scholar]
  56. Kompel, A.J.; Li, X.; Guermazi, A.; Murakami, A.M. Radiographic Evaluation of Patients with Anterior Shoulder Instability. Curr Rev Musculoskelet Med 2017, 10(4), 425–433. [Google Scholar] [CrossRef] [PubMed]
  57. Stillwater, L.; Koenig, J.; Maycher, B.; Davidson, M. 3D-MR vs. 3D-CT of the Shoulder in Patients with Glenohumeral Instability. Skeletal Radiol 2017, 46(3), 325–331. [Google Scholar] [CrossRef] [PubMed]
  58. Cantarelli Rodrigues, T.; Deniz, C.M.; Alaia, E.F.; et al. Three-dimensional MRI Bone Models of the Glenohumeral Joint Using Deep Learning: Evaluation of Normal Anatomy and Glenoid Bone Loss. Radiol Artif Intell 2020, 2(5), e190116. [Google Scholar] [CrossRef] [PubMed]
  59. Wei, J.; Li, D.; Sing, D.C.; et al. Detecting Upper Extremity Native Joint Dislocations Using Deep Learning: A Multicenter Study. Clin Imaging 2022, 92, 38–43. [Google Scholar] [CrossRef] [PubMed]
  60. Till, S.E.; Lu, Y.; Reinholz, A.K.; et al. Artificial Intelligence Can Define and Predict the "Optimal Observed Outcome" After Anterior Shoulder Instability Surgery: An Analysis of 200 Patients With 11-Year Mean Follow-Up. Arthrosc Sports Med Rehabil 2023, 5(4), e100773. [Google Scholar] [CrossRef] [PubMed]
  61. Chianca, V.; Albano, D.; Messina, C.; et al. Rotator Cuff Calcific Tendinopathy: From Diagnosis to Treatment. Acta Biomed 2018, 89(1-S), 186-196.
  62. Bechay, J.; Lawrence, C.; Namdari, S. Calcific Tendinopathy of the Rotator Cuff: A Review of Operative Versus Nonoperative Management. Phys Sportsmed 2020, 48(3), 241–246. [Google Scholar] [CrossRef] [PubMed]
  63. Dong, S.; Li, J.; Zhao, H.; et al. Risk Factor Analysis for Predicting the Onset of Rotator Cuff Calcific Tendinitis Based on Artificial Intelligence. Comput Intell Neurosci 2022, 2022, 8978878. [Google Scholar]
  64. Vassalou, E.E.; Klontzas, M.E.; Marias, K.; Karantanas, A.H. Predicting Long-term Outcomes of Ultrasound-guided Percutaneous Irrigation of Calcific Tendinopathy with the Use of Machine Learning. Skeletal Radiol 2022, 51(2), 417–422. [Google Scholar] [CrossRef] [PubMed]
  65. Jawa, A.; Burnikel, D. Treatment of Proximal Humeral Fractures: A Critical Analysis Review. JBJS Rev 2016, 4(1), e2. [Google Scholar] [CrossRef]
  66. Chung, S.W.; Han, S.S.; Lee, J.W.; et al. Automated Detection and Classification of the Proximal Humerus Fracture by Using Deep Learning Algorithm. Acta Orthop. 2018, 89(4), 468–473. [Google Scholar] [CrossRef] [PubMed]
  67. Magnéli, M.; Ling, P.; Gislén, J.; et al. Deep Learning Classification of Shoulder Fractures on Plain Radiographs of the Humerus, Scapula and Clavicle. PLoS One. 2023, 18(8), e0289808. [Google Scholar] [CrossRef]
  68. Dipnall, J.F.; Lu, J.; Gabbe, B.J.; et al. Comparison of State-of-the-Art Machine and Deep Learning Algorithms to Classify Proximal Humeral Fractures Using Radiology Text. Eur J Radiol. 2022, 153, 110366. [Google Scholar] [CrossRef] [PubMed]
  69. Guan, H.; Wu, Q.; Zhou, Y.; et al. A Retrospective Study of Ultrasound-Guided Intervention for Frozen Shoulder in the Frozen Stage. Front Surg. 2022, 9, 998590. [Google Scholar] [CrossRef] [PubMed]
  70. Yu, L.; Li, Y.; Wang, X.F.; Zhang, Z.Q. Analysis of the Value of Artificial Intelligence Combined with Musculoskeletal Ultrasound in the Differential Diagnosis of Pain Rehabilitation of Scapulohumeral Periarthritis. Medicine (Baltimore). 2023, 102(14), e33125. [Google Scholar] [CrossRef] [PubMed]
  71. Shu, Y.C.; Lo, Y.C.; Chiu, H.C.; et al. Deep Learning Algorithm for Predicting Subacromial Motion Trajectory: Dynamic Shoulder Ultrasound Analysis. Ultrasonics. 2023, 134, 107057. [Google Scholar] [CrossRef] [PubMed]
  72. Jiang, H.; Chen, L.; Zhao, Y.J.; Lin, Z.Y.; Yang, H. Machine Learning-Based Ultrasomics for Predicting Subacromial Impingement Syndrome Stages. J Ultrasound Med. 2022, 41(9), 2279–2285. [Google Scholar] [CrossRef] [PubMed]
  73. Chang, K.V.; Wu, W.T.; Özçakar, L. Association of Bicipital Peritendinous Effusion with Subacromial Impingement: A Dynamic Ultrasonographic Study of 337 Shoulders. Sci Rep 2016, 6, 38943. [Google Scholar] [CrossRef]
  74. Chang, K.V.; Chen, W.S.; Wang, T.G.; Hung, C.Y.; Chien, K.L. Associations of Sonographic Abnormalities of the Shoulder with Various Grades of Biceps Peritendinous Effusion (BPE). Ultrasound Med Biol 2014, 40(2), 313–321.
  75. Lin, B.S.; Chen, J.L.; Tu, Y.H.; et al. Using Deep Learning in Ultrasound Imaging of Bicipital Peritendinous Effusion to Grade Inflammation Severity. IEEE J Biomed Health Inform. 2020, 24(4), 1037–1045. [Google Scholar] [CrossRef] [PubMed]
  76. Grauhan, N.F.; Niehues, S.M.; Gaudin, R.A.; et al. Deep Learning for Accurately Recognizing Common Causes of Shoulder Pain on Radiographs. Skeletal Radiol. 2022, 51(2), 355–362. [Google Scholar] [CrossRef] [PubMed]
  77. Chang, M.; Canseco, J.A.; Nicholson, K.J.; Patel, N.; Vaccaro, A.R. The Role of Machine Learning in Spine Surgery: The Future Is Now. Front Surg. 2020, 7, 54. [Google Scholar] [CrossRef] [PubMed]
  78. Loftus, T.J.; Tighe, P.J.; Filiberto, A.C.; et al. Artificial Intelligence and Surgical Decision-making. JAMA Surg. 2020, 155(2), 148–158. [Google Scholar] [CrossRef] [PubMed]
  79. Kumar, V.; Patel, S.; Baburaj, V.; Vardhan, A.; Singh, P.K.; Vaishya, R. Current Understanding on Artificial Intelligence and Machine Learning in Orthopaedics - A Scoping Review. J Orthop. 2022, 34, 201–206. [Google Scholar] [CrossRef] [PubMed]
  80. Myers, T.G.; Ramkumar, P.N.; Ricciardi, B.F.; Urish, K.L.; Kipper, J.; Ketonis, C. Artificial Intelligence and Orthopaedics: An Introduction for Clinicians. J Bone Joint Surg Am. 2020, 102(9), 830–840. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The relationship of AI, ML, DL, and CNNs.
Figure 2. AI-related publications in the field of shoulder pathology from 2010 to 2023.
Figure 3. The flowchart for literature screening.
Table 1. Glossary of key terms.
| Term | Definition |
| --- | --- |
| Area under the curve (AUC) | A metric for evaluating the performance of binary classification models. It provides a concise measure of the model's ability to discriminate between positive and negative classes and is widely used for comparing and assessing the overall performance of predictive models. |
| Class activation map (CAM) | A technique that generates a heatmap to visualize the regions of an input image that are important for predicting a specific class in a deep convolutional neural network. It helps in interpreting model decisions and understanding the features learned by the network during classification. |
| DenseNet121 | DenseNet is a deep learning architecture characterized by dense connectivity patterns, in which each layer receives direct input from all preceding layers, leading to improved feature reuse, parameter efficiency, and gradient flow during training. DenseNet121 has 121 layers in total and is commonly used for image classification tasks on the ImageNet dataset. |
| Dice similarity coefficient (DSC) | A statistical measure of the similarity between two sets, often used in image segmentation to evaluate the overlap between predicted and ground truth masks. It ranges from 0 to 1, with 1 indicating perfect overlap between the two sets and 0 indicating no overlap. |
| F1-score | A performance metric for binary classification models, which predict one of two possible outcomes based on input data. It is the harmonic mean of precision and recall and ranges from 0 to 1, where 1 indicates perfect precision and recall and 0 indicates poor performance. |
| Gradient-weighted class activation mapping (Grad-CAM) | A technique that extends the class activation map (CAM) approach to provide better visual explanations for the predictions made by deep convolutional neural networks. |
| nnU-Net | An extension of the original U-Net architecture: a framework for 3D biomedical image segmentation that aims to provide a standardized and automated way to train and evaluate deep learning models on various datasets. |
| Otsu thresholding technique | An image processing technique for automatic image thresholding. The goal of thresholding is to separate objects or regions of interest from the background by converting the image into a binary (black and white) image. |
| Segmentation Model Adopting a pre-trained Classification Architecture (SMART-CA) | A deep learning algorithm that improves the efficiency and accuracy of CNNs by adaptively refining the network architecture during training based on the complexity of the input data, which uses a self-modulating mechanism and a measure of network capacity called the Channel Attention Score to achieve this. |
| Shapley plot | A tool for explaining and interpreting machine learning models by attributing the model's predictions to individual features, helping data scientists and stakeholders gain insights into the model's decision-making process and understand the significance of each feature in driving the model's output. |
| U-Net | A convolutional neural network architecture designed for biomedical image segmentation tasks. It consists of a contracting path to capture context and a symmetric expanding path to enable precise localization. |
| Voxception-ResNet (VRN) | A hybrid neural network architecture that merges the strengths of Voxception and ResNet to tackle tasks that require processing 3D image data. |
| XGBoost model | A versatile and efficient algorithm that excels in handling structured/tabular data and is widely used for tasks such as regression, classification, and ranking. |
| Youden index | A single statistic that captures the performance of a binary classification test. It takes into account both the sensitivity and the specificity of the test to provide an overall measure of its accuracy, with 1 indicating perfect performance and 0 indicating no discriminatory power. |
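Three of the metrics defined in the glossary can be made concrete with a minimal sketch that uses their standard formulas: DSC = 2|A∩B| / (|A| + |B|), F1 = 2TP / (2TP + FP + FN), and Youden index = sensitivity + specificity − 1. The function names and the example inputs are illustrative assumptions and do not correspond to any study in this review.

```python
from typing import Sequence


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient for two flattened binary masks (values 0 or 1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    total = sum(pred) + sum(truth)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1-score from confusion-matrix counts (harmonic mean of precision and recall)."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def youden_index(tp: int, fn: int, tn: int, fp: int) -> float:
    """Youden index: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Hypothetical flattened masks and confusion-matrix counts.
dsc = dice([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
f1 = f1_score(tp=8, fp=2, fn=2)
j = youden_index(tp=9, fn=1, tn=8, fp=2)
```

For segmentation, the masks would first be flattened from 2D or 3D arrays; the calculation itself is unchanged.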
Table 2. AI applications in the rotator cuff tears.
| Author (year) | Input feature | Model | Dataset | Type of outcome | Results |
| --- | --- | --- | --- | --- | --- |
| Kim et al. 2020 | X-ray | DL(1) | 6,793 radiograph series | Rule out significant RCTs(2) | The sensitivity, NPV(3), and LR−(4) were 97.3%, 96.6%, and 0.06, respectively. |
| Kang et al. 2021 | X-ray | DL | 2,779 radiograph series | Rule out subscapularis tendon tears | The AUC(5), sensitivity, NPV, and LR− were 0.83, 91.4%, 90.4%, and 0.21 in Test Set 1 and 0.82, 90.2%, 89.5%, and 0.21 in Test Set 2, respectively. |
| Iio et al. 2023 | X-ray | DL | 2,803 radiograph series | Rule out significant RCTs | The sensitivity, NPV, and LR− were 94.5%, 96.2%, and 0.10, respectively. |
| Lin et al. 2023 | MRI | DL | 11,925 MRI scans | Detection and classification of RCTs | The AUCs for supraspinatus, infraspinatus, and subscapularis tendon tears were 0.93, 0.89, and 0.90, respectively. The model performed best for full-thickness supraspinatus, infraspinatus, and subscapularis tears, with AUCs of 0.98, 0.99, and 0.95, respectively. |
| Yao et al. 2022 | MRI | DL | 200 MRI scans | Detection and segmentation of supraspinatus tears | The sensitivity and specificity were 85.0% and 85.0%, respectively. The AUC for classification was 0.943, and the DSC(6) for segmentation was 0.814. |
| Guo et al. 2023 | MRI | 2D CNN | 701 MRI scans for training and 69 MRI scans for clinical validation | Detection of supraspinatus tears | The model showed high F1-scores and sensitivity on both surgery and internal test sets. Subgroup analyses confirmed its robustness across tear degrees and MRI field strengths. |
| Lee et al. 2020 | MRI | 3D U-Net CNN(7) | 303 MRI scans | Segmentation of RCTs | The model reached a DSC of 94.3%, sensitivity of 97.1%, specificity of 95.0%, precision of 84.9%, F1-score of 90.5%, and Youden index of 91.8%. |
| Shim et al. 2020 | MRI | VRN(8)-based 3D CNN | 2,124 MRI scans | Detect the presence or absence of RCTs, classify the tear size, and provide 3D visualization of the tear location | The model outperformed orthopedists in binary accuracy (92.5% vs. 76.4% and 68.2%), top-1 accuracy (69.0% vs. 45.8% and 30.5%), top-1 ± 1 accuracy (87.5% vs. 79.8% and 71.0%), sensitivity (0.94 vs. 0.86 and 0.90), and specificity (0.90 vs. 0.58 and 0.29). The generated 3D CAM(9) provided effective information regarding the 3D location and size of the tear. |
| Lee et al. 2021 | Ultrasound imaging | CNN, denoted as SMART-CA(10) | 1,400 ultrasound images | Segmentation of RCTs | The precision, recall, and DSC were 0.604% (+38.4%), 0.942% (+14.0%), and 0.736% (+38.6%), respectively. |
| Ho et al. 2022 | Ultrasound imaging | CNN (five models) | 194 ultrasound images | Segmentation of RCTs | DenseNet121 demonstrated the best performance, with 88.2% accuracy, 93.8% sensitivity, 83.6% specificity, and an AUC of 0.832. |
| Ro et al. 2021 | MRI | DL | 240 MRI scans | Segmentation of the supraspinatus muscle and fossa, and calculation of the amount of fatty infiltration of the supraspinatus muscle | The mean DSC, accuracy, sensitivity, specificity, and relative area difference for the segmented lesion were 0.97, 99.84, 96.89, 99.92, and 0.07, respectively, for the supraspinatus fossa and 0.94, 99.89, 93.34, 99.95, and 2.03, respectively, for the supraspinatus muscle. |
| Kim et al. 2019 | MRI | DL | 240 MRI scans | Segmentation of the supraspinatus muscle and fossa | The DSC was 0.9718 ± 0.012 in the fossa region and 0.9463 ± 0.047 in the muscle region. |
| Taghizadeh et al. 2020 | CT | CNN | 103 CT scans | Segmentation of RC(11) muscle, and calculation of muscle atrophy and degeneration | The average DSC of the automatic muscle segmentations (88% ± 9%) was comparable to that of manual segmentations by human raters (89% ± 6%). The model provided good to very good estimates of muscle atrophy (R² = 0.87), fatty infiltration (R² = 0.91), and overall muscle degeneration (R² = 0.91). |
| Medina et al. 2020 | MRI | U-Net-based CNN | 258 cases for model A (Y-view selection) and 1,048 sagittal T1 Y-views for model B (muscle segmentation) | Segmentation of RC muscles on a Y-view | Model A showed top-3 accuracy > 98% in selecting an appropriate Y-view. Model B produced accurate RC muscle segmentations with mean DSC > 0.93. |
| Li et al. 2023 | Questionnaires and physical examinations | ML (six models) | 1,684 patients | Identify the best model and the important clinical variables for predicting patients with RCTs in outpatient settings | The XGBoost model showed superior performance, with accuracy, AUC, and Brier scores of 0.85, 0.92, and 0.15, respectively. The most important variables for prediction were the Jobe test, Bear hug test, and age, with mean SHAP(12) values of 1.458, 0.950, and 0.790, respectively. |
| Potty et al. 2023 | Patient-related and surgical-related factors | ML | 631 patients | Identify important clinical variables for predicting the outcomes of patients undergoing RCT repair | The algorithm predicted post-operative outcomes accurately. The most essential variables were pre-operative ASES(13) score, pre-operative pain score, BMI(14), age, and tendon quality. |

(1) Deep learning; (2) Rotator cuff tears; (3) Negative predictive value; (4) Negative likelihood ratio; (5) Area under the curve; (6) Dice similarity coefficient; (7) Convolutional neural network; (8) Voxception-ResNet; (9) Class activation map; (10) Segmentation Model Adopting a pre-trained Classification Architecture; (11) Rotator cuff; (12) Shapley additive explanation; (13) American Shoulder and Elbow Surgeons; (14) Body mass index.
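Several rows of Table 2 report sensitivity, NPV, and LR− (the negative likelihood ratio) for rule-out tasks. The sketch below shows how these quantities follow from confusion-matrix counts, using the standard definitions NPV = TN / (TN + FN) and LR− = (1 − sensitivity) / specificity. The function name and the counts are hypothetical and are not drawn from any study in the table.

```python
from typing import Dict


def rule_out_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, NPV, and negative likelihood ratio from counts."""
    if tp + fn == 0 or tn + fp == 0 or tn + fn == 0 or tn == 0:
        raise ValueError("counts do not allow all metrics to be computed")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    npv = tn / (tn + fn)
    # A lower LR- means a negative result more strongly argues against the condition.
    lr_negative = (1.0 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": npv,
        "lr_negative": lr_negative,
    }


# Hypothetical counts for a screening model evaluated on 200 radiograph series.
metrics = rule_out_metrics(tp=90, fn=10, tn=60, fp=40)
```

A rule-out tool is typically tuned for high sensitivity and NPV, accepting lower specificity, which is why LR− is reported alongside them.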
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
