Hamstring Strain Injury Risk in Soccer: An Exploratory, Hypothesis-Generating Prediction Model

Kekelekis Afxentios; Rabiu Muazu Musa; Pantelis Nikolaidis; Filipe Manuel Clemente; Eleftherios Kellis

doi:10.20944/preprints202509.1699.v1

Submitted: 19 September 2025

Posted: 19 September 2025

Abstract

Background: Hamstring strain injuries (HSI) are common in soccer and difficult to predict. Machine learning and regression approaches may identify novel strength-related predictors, but model development requires transparent reporting. Objective: To develop and internally validate a prediction model for hamstring strain injuries in amateur soccer players using preseason strength and clinical measures. Study Design: Prospective cohort study. Methods: This prospective cohort study included 120 amateur male soccer players monitored across one competitive season (30 weeks). Baseline predictors were age, body mass index, prior injury, and bilateral isometric hip and knee strength variables measured with a handheld dynamometer. The outcome was player-level hamstring injury status (≥1 HSI vs none). Twenty candidate predictors were reduced to 10 via symmetrical uncertainty feature ranking. A logistic regression model was trained (n=83 players) with nested four-fold cross-validation and tested on an independent hold-out set (n=37 players). Model performance was evaluated using the area under the ROC curve (AUC), calibration slope and intercept, and confusion matrices. Results: Twenty-one players sustained ≥1 HSI (32 events; 28% reinjuries). With 10 predictors and 21 injured players (the player-level outcome events), the events-per-variable ratio was 2.1, below recommended thresholds, indicating a risk of overfitting. On the test set (5 injured, 32 uninjured), the model achieved an accuracy of 64.9%, AUC 0.68 (95% CI 0.52–0.84), calibration slope 0.85, and intercept −0.12. Sensitivity was 60% and specificity 65.6%. Dominant-leg hip abduction strength was the only statistically significant predictor (OR=0.82, 95% CI 0.70–0.96), though stability analyses identified previous hamstring injury as the most consistent contributor despite its non-significance in the regression, which likely reflects the limited number of events.
Conclusion: Previous hamstring injury remained the strongest predictor of future injury risk, while reduced dominant-leg hip abduction strength emerged as a candidate risk factor but demonstrated instability under resampling. Neither age nor hamstring isometric strength was a significant predictor in this cohort. Model discrimination was modest, calibration indicated mild overfitting, and overall risk of bias was high. This study represents a TRIPOD Category 2 prediction model development without external validation. Findings should therefore be considered exploratory and hypothesis-generating, requiring confirmation in larger, methodologically robust cohorts.

Keywords: hamstring injury; prediction model; logistic regression; TRIPOD; soccer

Subject: Medicine and Pharmacology - Orthopedics and Sports Medicine

1. Introduction

Hamstring strain injury remains prevalent across sporting activities that involve sprinting, jumping, acceleration, deceleration, and rapid changes of direction, resulting in significant time-loss from sport. [1,2] Despite extensive efforts in hamstring injury prevention, [3] hamstring strain injury rates increased from 12% to 24% during a 21-year UEFA Elite Club injury surveillance, with a reported frequency of 1.7 injuries per 1000 h of total play. Match injury rates were approximately 10 times higher than training rates (4.99/1000 h vs 0.52/1000 h; RR 9.67, 95% CI 8.93 to 10.47), with a median time-loss of 13 days, [4] resulting in high costs for both athletes and teams. [5]

A variety of intrinsic and extrinsic risk factors for HSI have been proposed. Non-modifiable factors include older age and previous injury, both of which have been consistently reported as strong predictors of future strain. [6] In fact, prior HSI has been identified as the single most robust risk factor, with relative risk estimates ranging from 2.3 to 6.1 in different cohorts. [6] The mechanisms underlying this association are thought to include incomplete recovery of muscle architecture, persistent neuromuscular inhibition, and structural changes within the myotendinous junction. Modifiable factors, such as hamstring strength, [7] flexibility, [8] fascicle length, [9] endurance, [10] and fatigue, [11] have been the focus of many preseason screening batteries, but their predictive value has been inconsistent. [12,13,14] Several prospective studies have linked eccentric hamstring weakness to increased risk, [15] whereas others have failed to demonstrate a significant association. [16] Residual weakness after an initial injury has been suggested to explain the high reinjury rates. [4] These discrepancies emphasize the multifactorial nature of HSI and the limitations of single-parameter screening.

While much of the literature has focused on the hamstrings themselves, attention has increasingly turned to adjacent muscle groups that may indirectly influence hamstring loading. The hip abductors have emerged as a plausible candidate, given their central role in lumbopelvic stability during running and cutting. [17,18,19] Functionally, the hip abductors stabilize the pelvis in the frontal plane and help maintain lower-limb alignment during stance. [20,21] Weakness in this group may allow excessive pelvic drop or anterior tilt, which alters hamstring length–tension relationships, particularly during the late swing phase of sprinting when the hamstrings are lengthening while generating high force. [22] Poor abductor strength may also promote excessive knee valgus and increase iliotibial band tension, further disrupting energy transfer across the pelvis and trunk. [23] These alterations could increase mechanical demand on the hamstrings and predispose players to strain injuries. [24,25] Although the influence of hip abductors has been suggested in theoretical and biomechanical studies, prospective evidence linking abductor strength to HSI risk remains scarce.

Given the uncertainty around both direct and indirect strength measures, prediction models have been developed to integrate multiple risk factors. [26] Prediction requires multivariate approaches that can account for interactions among risk factors. [27] Traditional regression models have long been used to identify associations, but their translation into reliable prediction tools has been limited. [28] Several HSI prediction models have been published, often incorporating age, previous injury, and strength measures. Some reported apparently high discrimination (AUC >0.80), [29] but systematic reviews and methodological critiques have highlighted important shortcomings: most models relied on small sample sizes and single baseline measures, and lacked external or temporal validation. [30,31] Many failed to assess calibration, used redundant predictors, and suffered from low events-per-variable ratios, which increase the risk of overfitting and optimism in reported performance. [32,33,34] Broader clinical research has shown that machine learning rarely outperforms logistic regression in such contexts. [31] In sports medicine, this has been reflected in inconsistent results: while some studies report promising classification accuracy, [33,34] others find poor generalizability and instability of predictors when applied to new cohorts. [35,36]

These limitations underline the importance of developing prediction models that are transparent, hypothesis-generating, and compliant with established reporting standards. Frameworks such as TRIPOD and PROBAST provide clear guidance for reporting, validation, and risk of bias assessment. [37] The recent TRIPOD-AI extension further emphasizes the need to address issues of overfitting, calibration, and predictor stability when applying machine learning or penalized regression approaches in biomedical research. By situating new models within this framework, researchers can avoid overstating findings and contribute incremental knowledge rather than definitive screening tools.

Thus, while hip abductors are biomechanically plausible contributors to hamstring injury, their role as predictors has not been tested prospectively in large amateur cohorts. To date, no study has prospectively examined the role of hip abductor strength as a predictor of HSI in male amateur soccer players, who represent the largest playing population worldwide but remain underrepresented in research. Amateur players often lack the medical and conditioning support available in elite environments, which may increase their vulnerability to injury and alter the relevance of different risk factors.

The objectives of this study were: (1) to develop and internally validate a multivariable prediction model for hamstring strain injuries in amateur soccer players, using preseason measures of isometric hip adduction, hip abduction, hip flexion, and hamstring strength, together with their relative ratios; (2) to examine whether these strength variables, alongside previous injury history, contribute meaningfully to predicting injury risk; and (3) to evaluate the predictive accuracy of logistic regression and machine learning classifiers applied to the dataset. As a secondary aim, we also documented self-reported mechanisms of injury during follow-up, while acknowledging that these accounts were not objectively verified and should therefore be interpreted with caution. In line with TRIPOD+AI guidance, this represents a Category 2 study (model development without external validation). Given the small number of outcome events, the model should be considered exploratory and hypothesis-generating rather than confirmatory.

2. Methods

2.1. Study Design

This study followed an observational cohort design spanning 30 weeks (August 2018 to April 2019) and aligned with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines. [38] Our research group has previously conducted three studies using the same database: two on the epidemiology of musculoskeletal injuries [1,39,40] and one on groin injury risk factors. [40] Building on this foundation, the present investigation focused on hamstring strain injury. Reporting follows the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) and TRIPOD-AI extensions. [37] A completed TRIPOD/TRIPOD-AI checklist with item-level page references is provided in Supplementary Material 3. Methodological quality was assessed using the PROBAST and PROBAST-AI frameworks, which evaluate bias across four domains: participants, predictors, outcome, and analysis. The results of this assessment are reported in Supplementary Material 4.

At baseline, all players underwent bilateral isometric strength testing of the hip adductors, abductors, flexors, and knee flexors. Players were prospectively followed throughout the competitive season, with injury incidence and exposure systematically recorded.

2.2. Participants

An a priori sample size analysis was conducted using G*Power (version 3.1.9.7; Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany) [41] for logistic regression with a small-to-moderate effect size (Cohen’s f² = 0.05–0.25), α = 0.05, and power = 0.80. A sample of 110–130 players was deemed sufficient, consistent with prior injury prediction studies. [42,43,44,45] However, in prediction model development, contemporary guidance emphasizes the number of outcome events per predictor parameter (EPV) and model optimism rather than total sample size. [46] Ten predictors entered the final model, yielding an EPV ≈ 2.1, well below recommended thresholds (≥10–20 EPV). This limitation substantially increases the risk of overfitting and instability; therefore, the analysis is presented as exploratory and hypothesis-generating only. Eligible participants were male amateur soccer players aged ≥ 14 years, competing in the regional amateur soccer league, and free of musculoskeletal injury for ≥ 3 months before baseline. A total of 253 male players from 11 teams were screened during the 2018/19 off-season period (June to August). Of these, 176 players initially agreed to participate in the study, but 46 players were excluded due to non-compliance with the exposure limitations or inability to follow the data collection procedures. Ten further players were excluded due to injury before the pre-season. The final cohort included 120 players. All trained in a standardized weekly microcycle consisting of four training sessions (Monday to Friday) designed to progressively build physical load, typically culminating in an official competitive match at the weekend (Saturday or Sunday). Written informed consent was obtained from all participants, and the study protocol was approved by the Aristotle University Ethical Committee (ERC-012/2019) in accordance with the Declaration of Helsinki.
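The events-per-variable figure quoted above follows from simple arithmetic on the reported counts (21 injured players, 10 retained predictors). The short sketch below reproduces it; the function name is hypothetical and introduced only for illustration.

```python
def events_per_variable(n_events: int, n_parameters: int) -> float:
    """Outcome events divided by the number of predictor parameters."""
    if n_parameters <= 0:
        raise ValueError("n_parameters must be positive")
    return n_events / n_parameters

# Counts reported in this study: 21 injured players, 10 retained predictors.
epv_final = events_per_variable(21, 10)
print(f"EPV (10 retained predictors): {epv_final:.2f}")

# For comparison only: the same 21 events spread over all 20 candidate predictors.
epv_candidates = events_per_variable(21, 20)
print(f"EPV (20 candidate predictors): {epv_candidates:.2f}")

# Events that would be needed to reach 10 EPV with 10 predictors.
print("Events needed for EPV = 10:", 10 * 10)
```

Under these counts, roughly five times as many injured players would be needed to reach even the lower threshold of 10 EPV with the same ten predictors.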

2.3. Data Collection, Testing Protocol and Injury Registration

Baseline testing was performed in August 2018 at the medical facilities of the participating clubs. To standardize recovery status, all measurements were conducted on the 14th training day following a rest day, between 17:00 and 18:30. Isometric strength of the hip adductors, hip abductors, hip flexors, and knee flexors was assessed bilaterally using a handheld dynamometer (KFORCE Muscle Controller, K-Invent, Montpellier, France). This device has demonstrated high intra-rater reliability and validity in musculoskeletal testing, with intraclass correlation coefficients reported above 0.80 for force and torque reliability and above 0.79 for validity. [47,48] Testing followed a standardized break-test protocol: each muscle group was tested twice with a 30-second rest interval, and the higher maximal voluntary contraction value was recorded. A two-minute rest period was provided between tests of different muscle groups to minimize fatigue. Muscle groups were tested in standardized positions: hip adduction and hip flexion in supine, hip abduction in side-lying, and knee flexion in prone with the knee at 15° of flexion. Hip extensor testing was not performed due to field-based logistical constraints. Accurate assessment of extensor force, particularly from the gluteus maximus, requires precise joint stabilization, torque alignment, and prolonged examiner involvement, which would have substantially increased testing duration in this large-cohort setting. Similar limitations in field-based hip extensor testing have been described previously. [49] From raw strength assessments, absolute values and selected ratios were derived, resulting in a total of 20 candidate predictors (Supplementary Material 5). All strength variables were initially recorded in kilograms (kg) and normalised to body mass, yielding values expressed in Newton-metres per kilogram (Nm·kg⁻¹). These continuous predictors were used in modelling. 
Two physiotherapists performed all testing: one served as lead examiner and the other assisted with participant positioning and stabilization. Both testers and all players were blinded to the recorded results for the duration of the study. Additional methodological detail and visual illustrations of the testing protocol are available in Supplementary Material 1. Injuries were registered prospectively according to the FIFA/UEFA consensus definition [50]: any musculoskeletal complaint causing ≥1 missed training session or match. Data were collected with a standardized surveillance form (Supplementary Material 2). Recorded items included date of injury, mechanism (sprinting, change of direction, or other), setting (training/match), reinjury status, and return-to-play duration.

2.4. Data Treatment and Statistical Analysis

All strength predictors were converted to Nm·kg⁻¹ and then min–max scaled. Symmetrical uncertainty was used to rank predictors; the top 10 were retained. To mitigate collinearity, variance inflation factors (VIF) were computed; when VIF > 5, correlated predictors were clustered and the most clinically interpretable was retained. All preprocessing (scaling, ranking, VIF checks) was nested strictly within training folds to prevent data leakage. The modelling outcome was a player-level binary indicator of hamstring strain injury during the season (injured vs. not injured). Data were split into training (70%; n = 83) and independent test (30%; n = 37) sets using stratification by injury status. Within the training set, four-fold cross-validation with nested resampling was used for hyperparameter tuning and optimism estimation. The primary model was logistic regression with elastic-net regularization; exploratory comparators included k-nearest neighbours and support vector machines under the same framework. No missing baseline data were present in the analysis dataset. Hyperparameters are shown in Supplementary Material 6.
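As an illustration of the ranking step, symmetrical uncertainty between a discretised predictor X and the outcome Y is commonly defined as SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below is a minimal standard-library version, not the software used in the study. The equal-width binning, the number of bins, and the toy data are all assumptions made for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def equal_width_bins(values, n_bins=3):
    """Discretise a continuous predictor into equal-width bins (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), bounded between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0
    mutual_info = h_x + h_y - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (h_x + h_y)

# Toy data (hypothetical, not taken from the study).
injured = [1, 1, 1, 0, 0, 0, 0, 0]
abduction = [1.1, 1.2, 1.0, 1.9, 2.0, 1.8, 2.1, 1.7]  # separates the classes
noise = [1.5, 1.9, 1.1, 1.2, 1.8, 1.6, 1.0, 2.0]      # unrelated to the outcome

scores = {
    "hip_abduction": symmetrical_uncertainty(equal_width_bins(abduction), injured),
    "noise": symmetrical_uncertainty(equal_width_bins(noise), injured),
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

To respect the no-leakage rule described above, a ranking of this kind would be recomputed inside each training fold rather than once on the full dataset.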

2.5. Development of the Logistic Regression Model

Logistic regression with an elastic-net penalty was fit to classify players as injured vs. not injured. The l1_ratio was tuned by nested four-fold cross-validation within the training set. All preprocessing (min–max scaling, feature ranking, and VIF-based pruning) was performed only within training folds. Models were implemented in PyCaret (Spyder IDE) with complementary analyses in Orange v3.4.0 and XLSTAT v2014.
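For orientation, one common way to write the elastic-net penalised logistic regression objective is shown below. It follows the parameterisation given in the scikit-learn documentation, which PyCaret wraps; that this exact form applies to the fitted model is an assumption. The symbols are notation introduced here: w is the coefficient vector, b the intercept, C the inverse regularisation strength, and ρ the mixing weight corresponding to l1_ratio.

```latex
\min_{w,\,b}\;
\frac{1-\rho}{2}\,\lVert w \rVert_2^2
\;+\; \rho\,\lVert w \rVert_1
\;+\; C \sum_{i=1}^{n} \log\!\Bigl(1 + \exp\bigl(-y_i\,(x_i^{\top} w + b)\bigr)\Bigr),
\qquad y_i \in \{-1, +1\},\quad 0 \le \rho \le 1
```

Setting ρ = 1 gives a pure L1 (lasso) penalty and ρ = 0 a pure L2 (ridge) penalty, so tuning ρ trades coefficient sparsity against shrinkage of correlated predictors.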

2.6. Model Performance

Planned performance evaluation included discrimination (AUC with 95% CIs), calibration (slope and intercept, plus calibration plots), and classification metrics (accuracy, sensitivity, specificity, precision, F1). Confusion matrices for both training and test sets are provided in Supplementary Material 7. Regression coefficients, odds ratios, and 95% CIs are presented in Supplementary Material 8.
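The AUC used for discrimination can be read as the probability that a randomly chosen injured player receives a higher predicted risk than a randomly chosen uninjured player. The sketch below computes it by direct pairwise comparison; the function is a generic illustration and the scores are hypothetical, not model output from this study.

```python
def roc_auc(y_true, y_score):
    """Rank-based AUC: share of (injured, uninjured) pairs in which the injured
    player has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks for 3 injured (1) and 5 uninjured (0) players.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_score = [0.80, 0.40, 0.30, 0.45, 0.10, 0.65, 0.20, 0.40]

auc = roc_auc(y_true, y_score)
print(round(auc, 3))  # 0.9 (13.5 of 15 pairs ordered correctly)
```

With 5 injured and 32 uninjured players in a hold-out set, the same calculation rests on only 5 × 32 = 160 pairs, which is one way to see why confidence intervals around such an estimate are wide.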

2.7. Stability Analyses

Model stability was examined using bootstrap resampling and permutation importance. Bootstrap resampling (200 iterations) refit the model on samples drawn with replacement from the training set and evaluated performance on the fixed test set. For permutation importance, each predictor was permuted 50 times in the test set while other variables were held constant; the mean change in AUC (ΔAUC) was recorded.
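The permutation step can be sketched as follows: shuffle one predictor column, rescore the unchanged model, and average the change in AUC over repeated shuffles, so that a negative ΔAUC indicates a predictor the model relies on. This is a minimal standard-library illustration. The scoring function stands in for a fitted model, and the data and column layout are hypothetical.

```python
import random

def roc_auc(y_true, y_score):
    """Rank-based AUC; ties between an injured and an uninjured score count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_delta_auc(score_fn, X, y, column, n_repeats=50, seed=0):
    """Mean of (permuted AUC - baseline AUC) over n_repeats shuffles of one column."""
    rng = random.Random(seed)
    baseline = roc_auc(y, [score_fn(row) for row in X])
    deltas = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced.
        X_perm = [row[:column] + [v] + row[column + 1:] for row, v in zip(X, shuffled)]
        deltas.append(roc_auc(y, [score_fn(row) for row in X_perm]) - baseline)
    return sum(deltas) / n_repeats

def score_fn(row):
    """Hypothetical fitted model: risk rises as hip abduction strength (column 0)
    falls; column 1 is an unused noise variable."""
    return -row[0]

X = [[1.0, 0.3], [1.1, 0.9], [1.2, 0.1], [1.7, 0.5],
     [1.8, 0.7], [1.9, 0.2], [2.0, 0.8], [2.1, 0.4]]
y = [1, 1, 1, 0, 0, 0, 0, 0]

delta_strength = permutation_delta_auc(score_fn, X, y, column=0)
delta_noise = permutation_delta_auc(score_fn, X, y, column=1)
print(f"strength: {delta_strength:+.3f}, noise: {delta_noise:+.3f}")
```

In this toy case the unused column yields a ΔAUC of exactly zero, whereas shuffling the informative column lowers the AUC. With few events, the spread of ΔAUC across shuffles can be as large as its mean, which is worth reporting alongside it.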

3. Results

Injury Incidence

Of the 120 participants (mean age 20.0 ± 6.9 years; BMI 22.5 ± 2.3 kg·m⁻²; height 1.77 ± 0.07 m; mass 70.7 ± 10.1 kg), 21 players sustained at least one hamstring strain injury during the season, for a total of 32 events (28% reinjuries). Most injuries occurred in the dominant limb (56.3%) and were attributed to sprinting activities (81.2%), based on player self-report without video or GPS confirmation. The majority were grade I strains, with a single grade II strain of the biceps femoris long head. Mean return-to-play duration was 9.3 ± 5.3 days (range 3–32). For prediction modelling, the outcome was analyzed at the player level (injured vs not injured).

Model Performance

On the training set (n = 83; 16 injured, 67 uninjured), the logistic regression model achieved an accuracy of 77.1% (64/83 correctly classified), with sensitivity 81.3% (13/16 injured) and specificity 76.1% (51/67 uninjured). On the independent test set (n = 37; 5 injured, 32 uninjured), accuracy declined to 64.9% (24/37 correct), with sensitivity 60% (3/5 injured) and specificity 65.6% (21/32 uninjured). The not-injured class comprised TN + FP = 21 + 11 = 32 players. Discrimination was modest, with AUC = 0.68 (95% CI 0.52–0.84) (Figure 1). Calibration analysis indicated mild overfitting with slight overestimation of risk (slope 0.85, intercept −0.12) (Figure 2).
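The test-set classification metrics can be checked directly from the reported confusion-matrix counts (3 true positives, 2 false negatives, 21 true negatives, 11 false positives). The sketch below does so with a small helper whose name is hypothetical. The positive predictive value is derived here from those counts, with injured as the positive class, and is not a figure reported in the text.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Basic metrics from confusion-matrix counts (injured = positive class)."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),  # derived here, not reported in the text
    }

# Test-set counts reported above: 3 of 5 injured and 21 of 32 uninjured correct.
test_metrics = classification_metrics(tp=3, fn=2, tn=21, fp=11)
for name, value in test_metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.649
# sensitivity: 0.600
# specificity: 0.656
# ppv: 0.214
```

Read this way, 14 players were flagged as at risk in the hold-out set and 3 of them were injured, which gives a practical sense of what a specificity of 65.6% means at this injury prevalence.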

To complement these split-sample results, Table 1 shows cross-validation performance metrics, with mean accuracy 69.9% (±3.8), AUC 0.79 (±0.06), sensitivity 0.70 (±0.07), precision 0.90 (±0.05), and F1 score 0.79 (±0.04). These results confirm moderate discrimination but limited sensitivity, reducing clinical utility.

Predictor Importance

Feature ranking identified ten of the original twenty variables as influential (Figure 3). In the final multivariable logistic regression, only dominant-leg hip abduction strength remained statistically significant (OR 0.82, 95% CI 0.70–0.96, p = 0.016), suggesting a protective association. Age, BMI, previous injury, and hamstring strength were not significant predictors (Table 2). The Hosmer–Lemeshow test (p = 0.98) and Nagelkerke R² (0.29) are reported for completeness but are interpreted cautiously, given the low events-per-variable ratio (2.1).

Stability Analyses

On the independent test set, the baseline logistic regression model achieved an AUC of 0.68 (95% CI 0.52-0.84). Bootstrap resampling (n = 200) yielded a mean AUC of 0.681 (SD 0.036; 2.5th–97.5th percentiles 0.600–0.745), indicating modest and uncertain discrimination consistent with the small number of events (Figure 4). Permutation importance confirmed previous hamstring injury as the strongest contributor to model performance (mean ΔAUC −0.032 ± 0.089), followed by dominant-leg hip abduction strength (ΔAUC −0.016 ± 0.039). Hamstring isometric strength exerted negligible influence (ΔAUC ≈ 0), and age produced no meaningful reduction (ΔAUC ≈ +0.034, consistent with noise) (Figure 5). These results support treating hip abduction as a candidate signal and reaffirm the established contribution of previous injury, while underscoring instability at EPV ≈ 2.1. Full distributions from bootstrap resampling and detailed permutation importance scores are provided in Supplementary Material 9.

4. Discussion

The primary findings of this investigation revealed that previous hamstring injury was the strongest predictor of subsequent hamstring strain injury (HSI), even though it did not reach conventional significance in the multivariable model, reflecting limited power. This is consistent with extensive literature identifying prior injury as the most robust risk factor in soccer. [6] In addition, reduced dominant-leg hip abduction strength emerged as a candidate signal, although bootstrap resampling and permutation importance indicated that this association was unstable at the current events-per-variable ratio (≈2.1). By contrast, isometric hamstring strength and age did not demonstrate predictive value in this cohort. Although not a primary study objective, descriptive injury surveillance confirmed that sprinting was the most frequently reported injury mechanism, aligning with prior reports that high-speed running is the predominant context for HSIs in soccer. [51,52] Most injuries were low grade, with rapid return-to-play, but details on anatomical site and mechanism could not be verified due to reliance on self-report.

This study is the first to explore dominant-side hip abduction strength as a potential contributor to HSI susceptibility in amateur soccer. The hip abductors play a role in frontal-plane stability and lumbopelvic control, and weakness in this group may compromise energy transfer and increase mechanical demand on the hamstrings. [20,53] However, the instability of our abductor signal highlights the risk of over-interpretation. A clearer mechanistic understanding will require future studies incorporating motion analysis and electromyography to evaluate lumbopelvic control during high-speed running. Although the abductor signal in our model was unstable, the potential link between dominant-side hip abduction weakness and HSI risk is biologically plausible. Prior research has focused primarily on intramuscular risk factors such as eccentric hamstring strength, fascicle length, endurance, and flexibility, while less attention has been given to the role of adjacent anatomical and functional structures, particularly the lumbar spine and pelvic stabilizers. [18,19] The hip abductors contribute to frontal-plane stability and energy transfer across the lumbopelvic region. Deficits in this group may compromise trunk and pelvic neuromuscular control, mechanisms that have been linked to lower-limb injury risk. [17,53] For instance, hip abductor weakness may promote excessive knee valgus, thereby increasing tension on the iliotibial band during the early stance phase of running, particularly when deceleration occurs to absorb ground reaction forces. [17] Although our study did not directly measure kinematic variables or muscle activation, our findings are consistent with existing theories suggesting that improved lumbopelvic control may contribute to reduced hamstring injury risk. [53] The relationship between hamstring strength and HSI risk remains debated.
While some prospective studies suggest that eccentric hamstring weakness increases risk, others report limited predictive value. [42,43] Our finding that isometric hamstring strength was not significant aligns with the latter, but methodological variability in strength testing and outcome definition likely contributes to discrepancies across studies. [54] Furthermore, our finding that age was not significantly associated with injury risk must be interpreted in light of the sample characteristics: the relatively young mean age of participants (20.0 ± 6.96 years) may have limited the ability to detect an age-related effect, which is more consistently observed in older or elite-level cohorts.

The predictive value of machine learning algorithms

Our model achieved modest discrimination, with a test-set AUC of 0.68 (95% CI 0.52–0.84). This level of performance is comparable to other small-sample studies of preseason screening, where AUC values typically range between 0.60 and 0.75 and rarely exceed 0.80. [22,26,32,33] While Ayala et al. [29] reported a higher AUC of 0.83 when including hip isometric strength in their model, their approach was subject to similar limitations of small event counts, baseline-only measures, and lack of external validation. The consistent challenge across studies is that models often show promising apparent discrimination but fail to generalize when tested in independent samples. [55] Importantly, discrimination alone provides an incomplete picture. Calibration, the agreement between predicted and observed risks, is equally critical for clinical utility. [56] Our calibration analysis indicated mild overfitting (slope 0.85, intercept −0.12), reinforcing the need for shrinkage and external validation. These issues are common: most sports injury prediction studies are underpowered, with low events-per-variable ratios, which inflates optimism and limits reproducibility. [57] The appeal of machine learning lies in its ability to accommodate complex, non-linear interactions among multiple risk factors. However, systematic reviews consistently show that ML rarely outperforms logistic regression when sample sizes are modest and predictors are limited. [58] Our findings mirror this evidence: logistic regression, combined with internal resampling, provided performance similar to what has been reported for more complex models, with the added benefit of interpretability and transparency. Therefore, while ML methods hold promise for multifactorial risk profiling, their predictive value in practice will depend on large, multicenter datasets, richer longitudinal features (e.g., training load, fatigue indices, neuromuscular control), and rigorous validation strategies.
In the current dataset, the model is best interpreted as exploratory, hypothesis-generating, and not ready for clinical deployment.

Strengths and limitations

A key strength of this study was the prospective cohort design, with preseason baseline testing and systematic surveillance across a full competitive season. Monitoring 120 players over 30 weeks ensured consistent exposure tracking in an amateur context. Strength testing followed a standardized and reliable protocol, with a single experienced physiotherapist serving as lead examiner for all tests, which minimized inter-rater variability. Although belt fixation was not feasible in the field, examiner-stabilized testing can provide acceptable reliability. [30] Internal validity of the assessments can therefore be considered high. Methodological transparency was another strength. The study adhered to STROBE and incorporated key TRIPOD-AI elements, including clear definitions of predictors, outcomes, and modelling strategy. Beyond conventional outputs, bootstrap resampling and permutation importance analyses were applied to explore model stability, and a PROBAST-AI assessment identified high risk of bias overall, particularly in the analysis domain due to low EPV and lack of external validation. Several limitations must be acknowledged. The number of events was small (21 players with HSI), producing a low events-per-variable ratio (~2.1) and increasing the risk of overfitting. External or temporally separated validation was not performed, so findings should be interpreted as hypothesis-generating. Injury mechanisms were self-reported and could not be verified by imaging or GPS, limiting precision. The predictor set was restricted to isometric strength and a few contextual variables, excluding neuromuscular or load-related measures. Finally, external validity is limited: the cohort consisted of young male amateur players, and results may not generalize to elite, female, or older populations.

Future directions

Future research on hamstring strain injury prediction should prioritize larger, multicenter cohorts to increase the number of outcome events and improve the reliability of modelling. Such studies should follow methodological guidance on prediction model development, ensuring adequate events-per-variable ratios, use of shrinkage, and thorough calibration. External and temporally separated validation datasets are essential to establish generalizability, since even apparently strong discrimination can reflect overfitting in small samples. The predictor set also requires expansion beyond isometric strength. Incorporating eccentric hamstring measures, hip extensor and rotator strength, and dynamic neuromuscular assessments would provide a more complete biomechanical profile. Longitudinal monitoring of workload (e.g., GPS-derived load, training-to-match ratios, acute:chronic workload ratio), along with markers of fatigue, recovery, and contextual factors such as sleep, wellness, or playing position, may better capture the dynamic nature of injury risk. Advanced measurement tools could further clarify mechanistic pathways. Motion capture or wearable sensors can quantify sprinting mechanics, lumbopelvic control, and deceleration strategies, while electromyography may reveal deficits in coordination or activation patterns. These approaches could determine whether the observed signal from hip abductor weakness reflects a true causal role or a spurious finding related to sample limitations. From an analytical perspective, future studies should embed rigorous resampling and validation strategies, including bootstrap, permutation importance, and nested cross-validation. Where datasets are sufficiently large, penalized regression, ensemble methods, or neural networks may be appropriate. However, systematic reviews show that machine learning does not consistently outperform logistic regression in modest datasets. [58] Emphasis should therefore remain on methodological transparency, calibration, and reproducibility rather than algorithm novelty. Finally, translation into practice requires impact evaluation. Beyond development and validation, research should test whether integrating predictors such as hip abduction strength into screening protocols influences clinical decision-making, reduces injury incidence, or improves rehabilitation outcomes. Such impact studies are the final step in the TRIPOD framework but are largely absent in sports injury research. A PROBAST-AI assessment indicated overall high risk of bias, mainly due to the low number of outcome events, the low events-per-variable ratio, and the absence of external validation. Risk was lower in the domains of participants and predictors, given the prospective design and standardized testing, but concerns were high in the analysis domain because of potential overfitting and instability. This reinforces the interpretation of our findings as exploratory and hypothesis-generating.

Clinical relevance

At present, prior hamstring injury remains the most reliable screening variable and should be prioritised when assessing injury risk in soccer players. Hip abductor strength may represent a useful adjunct in the future, but current evidence is preliminary and unstable. Clinicians should not yet base decision-making or prevention strategies on abductor measures alone, but incorporating them into comprehensive screening protocols may become valuable as evidence accumulates.

5. Conclusions

This prospective cohort study identified previous hamstring injury as the most consistent predictor of subsequent HSI, consistent with established evidence. Reduced dominant-leg hip abduction strength also emerged as a candidate risk factor, but its instability under bootstrap and permutation analyses indicates that this association should be interpreted cautiously. Neither isometric hamstring strength nor age was a significant predictor in this young amateur cohort. Model performance was modest, calibration indicated mild overfitting, and the PROBAST-AI assessment classified the analysis domain as high risk of bias due to the low events-per-variable ratio and absence of external validation. These findings highlight the limitations of preseason strength-based screening for injury prediction, but also point toward the potential relevance of lumbopelvic function, particularly hip abduction strength, for future research. Moving forward, larger multicenter cohorts with richer longitudinal predictors (eccentric strength, load monitoring, biomechanics, neuromuscular control) and robust external validation are required. Until such evidence accumulates, this model should be considered hypothesis-generating rather than clinically actionable.

Supplementary Materials

The following supporting information can be downloaded at the website of this paper posted on Preprints.org.

Author Contributions

Conceptualization: A.K.; methodology: A.K. and F.M.C.; software: R.M.M.; validation: R.M.M.; statistical analysis: R.M.M., A.K. and E.K.; writing—original draft preparation: A.K.; writing—review and editing: E.K., F.M.C. and P.T.N.; supervision: E.K. and F.M.C. All authors have read and agreed to the published version of the manuscript.

Funding

The authors reported there is no funding associated with the work featured in this article.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Aristotle University Ethical Committee (ERC-012/2019).

Informed Consent Statement

Informed consent was obtained from the parents and guardians of the subjects, and informed assent was obtained from the subjects involved in the study.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. A. Kekelekis, F. M. Clemente, and E. Kellis. Muscle injury characteristics and incidence rates in men’s amateur soccer: A one season prospective study. Research in Sports Medicine 2022, 00, 1–14.
2. López-Valenciano et al. Epidemiology of injuries in professional football: a systematic review and meta-analysis. pp. 1–9, 2019.
3. W. S. A. Al Attar and M. A. Husain. Effectiveness of Injury Prevention Programs With Core Muscle Strengthening Exercises to Reduce the Incidence of Hamstring Injury Among Soccer Players: A Systematic Review and Meta-Analysis. Nov. 01, 2023, SAGE Publications Inc.
4. J. Ekstrand, H. Bengtsson, M. Waldén, M. Davison, K. M. Khan, and M. Hägglund. Hamstring injury rates have increased during recent seasons and now constitute 24% of all injuries in men’s professional football: the UEFA Elite Club Injury Study from 2001/02 to 2021/22. Br J Sports Med 2023, 57, 292–298.
5. E. Eliakim, E. Morgulev, R. Lidor, and Y. Meckel. Estimation of injury costs: Financial damage of English Premier League teams’ underachievement due to injuries. BMJ Open Sport Exerc Med 2020, 6, 1–6.
6. Green, M. N. Bourne, N. Van Dyk, and T. Pizzari. Recalibrating the risk of hamstring strain injury (HSI): A 2020 systematic review and meta-analysis of risk factors for index and recurrent hamstring strain injury in sport. Br J Sports Med 2020, 54, 1081–1088.
7. J. W. Y. Lee, K.-M. Mok, H. C. K. Chan, P. S. H. Yung, and K.-M. Chan. Eccentric hamstring strength deficit and poor hamstring-to-quadriceps ratio are risk factors for hamstring strain injury in football: A prospective study of 146 professional players. J Sci Med Sport 2018, 21, 789–793.
8. G. Henderson, C. A. Barnes, and M. D. Portas. Factors associated with increased propensity for hamstring injury in English Premier League soccer players. J Sci Med Sport 2010, 13, 397–402.
9. R. G. Timmins, M. N. Bourne, A. J. Shield, M. D. Williams, C. Lorenzen, and D. A. Opar. Short biceps femoris fascicles and eccentric knee flexor weakness increase the risk of hamstring injury in elite football (soccer): A prospective cohort study. Br J Sports Med 2016, 50, 1524–1535.
10. Delextrat, J. Piquet, M. J. Matthews, and D. D. Cohen. Strength-Endurance Training Reduces the Hamstrings Strength Decline Following Simulated Football Competition in Female Players. Front Physiol 2018, 9. Available online: https://www.frontiersin.org/journals/physiology/articles/10.3389/fphys.2018.01059.
11. J. Mendiguchia, E. Alentorn-Geli, and M. Brughelli. Hamstring strain injuries: Are we heading in the right direction? Feb. 2012.
12. C. M. Wille, M. R. Stiffler-Joachim, S. A. Kliethermes, J. L. Sanfilippo, C. S. Tanaka, and B. C. Heiderscheit. Preseason Eccentric Strength Is Not Associated with Hamstring Strain Injury: A Prospective Study in Collegiate Athletes. Med Sci Sports Exerc 2022, 54, 1271–1277.
13. J. Orchard, J. Marsden, S. Lord, and D. Garlick. Preseason Hamstring Muscle Weakness Associated with Hamstring Muscle Injury in Australian Footballers.
14. Nordstrøm, R. Bahr, B. Clarsen, and O. Talsnes. Association Between Preseason Fitness Level and Risk of Injury or Illness in Male Elite Ice Hockey Players: A Prospective Cohort Study. Orthop J Sports Med 2022, 10, 1–10.
15. N. van Dyk, A. Farooq, R. Bahr, and E. Witvrouw. Hamstring and Ankle Flexibility Deficits Are Weak Risk Factors for Hamstring Injury in Professional Soccer Players: A Prospective Cohort Study of 438 Players Including 78 Injuries. American Journal of Sports Medicine 2018, 46, 2203–2210.
16. M. N. Bourne, D. A. Opar, M. D. Williams, and A. J. Shield. Eccentric knee flexor strength and risk of hamstring injuries in rugby union. American Journal of Sports Medicine 2015, 43, 2663–2670.
17. L. Heinert, T. W. Kernozek, J. F. Greany, and D. C. Fater. Hip Abductor Weakness and Lower Extremity Kinematics During Running. 2008.
18. M. D. Mucha, W. Caldwell, E. L. Schlueter, C. Walters, and A. Hassen. Hip abductor strength and lower extremity running related injury in distance runners: A systematic review. Apr. 01, 2017, Elsevier Ltd.
19. A. Alammari, N. Spence, A. Narayan, S. D. Karnad, and Z. C. Ottayil. Effect of hip abductors and lateral rotators’ muscle strengthening on pain and functional outcome in adult patients with patellofemoral pain: A systematic review and meta-analysis. 2023, IOS Press BV.
20. J. Schuermans, D. Van Tiggelen, T. Palmans, L. Danneels, and E. Witvrouw. Deviating running kinematics and hamstring injury susceptibility in male soccer players: Cause or consequence? Gait Posture 2017, 57, 270–277.
21. A. Higashihara, Y. Nagano, K. Takahashi, and T. Fukubayashi. Effects of forward trunk lean on hamstring muscle kinematics during sprinting. J Sports Sci 2015, 33, 1366–1375.
22. E. S. Chumanov, B. C. Heiderscheit, and D. G. Thelen. The effect of speed and influence of individual muscles on hamstring mechanics during the swing phase of sprinting. J Biomech 2007, 40, 3555–3562.
23. W. Ben Kibler, J. Press, and A. Sciascia. The Role of Core Stability in Athletic Function. 2006.
24. A. J. Shield and M. N. Bourne. Hamstring Injury Prevention Practices in Elite Sport: Evidence for Eccentric Strength vs. Lumbo-Pelvic Training. Mar. 01, 2018, Springer International Publishing.
25. M. Wallden and N. Walters. Does lumbo-pelvic dysfunction predispose to hamstring strain in professional soccer players? J Bodyw Mov Ther 2005, 9, 99–108.
26. N. F. N. Bittencourt, W. H. Meeuwisse, L. D. Mendonça, A. Nettel-Aguirre, J. M. Ocarino, and S. T. Fonseca. Complex systems approach for sports injuries: Moving from risk factor identification to injury pattern recognition - Narrative review and new concept. Nov. 01, 2016, BMJ Publishing Group.
27. N. Pudjihartono, T. Fadason, A. W. Kempa-Liehr, and J. M. O’Sullivan. A Review of Feature Selection Methods for Machine Learning-Based Disease Risk Prediction. Frontiers Media SA 2022.
28. M. Kolodziej et al. Predictive modeling of lower extremity injury risk in male elite youth soccer players using least absolute shrinkage and selection operator regression. Scand J Med Sci Sports 2023, 33, 1021–1033.
29. F. Ayala et al. A Preventive Model for Hamstring Injuries in Professional Soccer: Learning Algorithms. Int J Sports Med 2019, 40, 344–353.
30. M. Rico-González, J. Pino-Ortega, A. Méndez, F. Clemente, and A. Baca. Machine learning application in soccer: a systematic review. Biol Sport 2023, 40, 249–263.
31. R. Muazu Musa, A. P. P. Abdul Majeed, Z. Taha, M. R. Abdullah, A. B. Husin Musawi Maliki, and N. Azura Kosni. The application of Artificial Neural Network and k-Nearest Neighbour classification models in the scouting of high-performance archers from a selected fitness and motor skill performance parameters. Sci Sports 2019.
32. N. Rommers et al. A Machine Learning Approach to Assess Injury Risk in Elite Youth Football Players. Med Sci Sports Exerc 2020, 52, 1745–1751.
33. B. C. Luu et al. Machine Learning Outperforms Logistic Regression Analysis to Predict Next-Season NHL Player Injury: An Analysis of 2322 Players From 2007 to 2017. Orthop J Sports Med 2020, 8.
34. M. Henriquez, J. Sumner, M. Faherty, T. Sell, and B. Bent. Machine Learning to Predict Lower Extremity Musculoskeletal Injury Risk in Student Athletes. Front Sports Act Living 2020, 2.
35. H. Van Eetvelde, L. D. Mendonça, C. Ley, R. Seil, and T. Tischer. Machine learning methods in sport injury prediction and prevention: a systematic review. Dec. 01, 2021, Springer Science and Business Media Deutschland GmbH.
36. J. D. Ruddy et al. Predictive Modeling of Hamstring Strain Injuries in Elite Australian Footballers. Med Sci Sports Exerc 2018, 50, 906–914.
37. G. S. Collins et al. Protocol for development of a reporting guideline (TRIPOD-AI) and risk of bias tool (PROBAST-AI) for diagnostic and prognostic prediction model studies based on artificial intelligence. BMJ Open 2021, 11.
38. S. Cuschieri. The STROBE guidelines. Saudi J Anaesth 2019, 13, S31–S34.
39. A. Kekelekis, Z. Kounali, N. Kofotolis, F. M. Clemente, and E. Kellis. Epidemiology of Injuries in Amateur Male Soccer Players: A Prospective One-Year Study. Healthcare 2023, 11, 352.
40. A. Kekelekis, R. M. Musa, P. T. Nikolaidis, F. M. Clemente, and E. Kellis. Hip Muscle Strength Ratios Predicting Groin Injury in Male Soccer Players Using Machine Learning and Multivariate Analysis: A Prospective Cohort Study. Muscles 2024, 3, 297–309.
41. H. Kang. Sample size determination and power analysis using the G*Power software. 2021, Korea Health Personnel Licensing Examination Institute.
42. D. A. Opar, M. D. Williams, and A. J. Shield. Hamstring strain injuries: Factors that lead to injury and re-injury. Sports Medicine 2012, 42, 209–226.
43. N. Van Dyk et al. Hamstring and Quadriceps Isokinetic Strength Deficits Are Weak Risk Factors for Hamstring Strain Injuries. American Journal of Sports Medicine 2016, 44, 1789–1795.
44. A. B. H. M. Maliki et al. A multilateral modelling of Youth Soccer Performance Index (YSPI). In IOP Conference Series: Materials Science and Engineering, Institute of Physics Publishing, Apr. 2018.
45. J. Cohen. A Power Primer. 1992.
46. M. A. Babyak. What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting in Regression-Type Models. 2004.
47. M. Olds, S. McLaine, and N. Magni. Validity and Reliability of the Kinvent Handheld Dynamometer in the Athletic Shoulder Test. J Sport Rehabil.
48. M. B. de Almeida et al. Intra-Rater and Inter-Rater Reliability of the Kinvent Hand-Held Dynamometer in Young Adults. MDPI AG, Aug. 2023, p. 12.
49. L. L. Florencio, J. Martins, M. R. B. da Silva, J. R. da Silva, G. L. Bellizzi, and D. Bevilaqua-Grossi. Knee and hip strength measurements obtained by a hand-held dynamometer stabilized by a belt and an examiner demonstrate parallel reliability but not agreement. Physical Therapy in Sport 2019, 38, 115–122.
50. C. W. Fuller et al. Consensus statement on injury definitions and data collection procedures in studies of football (soccer) injuries. Mar. 2006.
51. G. Tokutake, R. Kuramochi, Y. Murata, S. Enoki, Y. Koto, and T. Shimizu. The risk factors of hamstring strain injury induced by high-speed running. J Sports Sci Med 2018, 17, 650–655.
52. L. Wolski, E. Pappas, C. Hiller, M. Halaki, and A. Fong Yan. Is there an association between high-speed running biomechanics and hamstring strain injury? A systematic review. 2021, Routledge.
53. J. Schuermans, L. Danneels, D. Van Tiggelen, T. Palmans, and E. Witvrouw. Proximal Neuromuscular Control Protects Against Hamstring Injuries in Male Soccer Players: A Prospective Study with Electromyography Time-Series Analysis during Maximal Sprinting. American Journal of Sports Medicine 2017, 45, 1315–1325.
54. J. L. Croisier, S. Ganteaume, J. Binet, M. Genty, and J. M. Ferret. Strength imbalances and prevention of hamstring injury in professional soccer players: A prospective study. American Journal of Sports Medicine 2008, 36, 1469–1475.
55. G. S. Bullock, J. Mylott, T. Hughes, K. F. Nicholson, R. D. Riley, and G. S. Collins. Just How Confident Can We Be in Predicting Sports Injuries? A Systematic Review of the Methodological Conduct and Performance of Existing Musculoskeletal Injury Prediction Models in Sport. Oct. 01, 2022, Springer Science and Business Media Deutschland GmbH.
56. E. W. Steyerberg et al. Poor performance of clinical prediction models: the harm of commonly applied methods. J Clin Epidemiol 2018, 98, 133–143.
57. B. Shiferaw, M. Roloff, D. Waltemath, and A. A. Zeleke. Guidelines and Standard Frameworks for AI in Medicine: Protocol for a Systematic Literature Review. 2023, JMIR Publications Inc.
58. E. Christodoulou, J. Ma, G. S. Collins, E. W. Steyerberg, J. Y. Verbakel, and B. Van Calster. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. Jun. 01, 2019, Elsevier USA.

Figure 1. ROC curve for the logistic regression model on the test set. Area under the curve (AUC) = 0.68 (95% CI 0.52–0.84).

Figure 2. Calibration plot for the logistic regression model on the test set, comparing predicted with observed probabilities of hamstring injury. The calibration slope was 0.85 and the intercept was −0.12, indicating mild overfitting (slope below 1) with slight overestimation of risk on average (intercept below 0).

Figure 3. Variable ranking based on symmetrical uncertainty. Predictor importance ranking for the 20 candidate variables. Ten predictors (hatched bars) were retained for model development, with dominant-leg hip abduction strength (HipAbd_D) ranked highest. Rankings are exploratory and should be interpreted with caution given the small number of outcome events (EPV = 2.1).
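
For readers unfamiliar with the ranking criterion, symmetrical uncertainty is the mutual information between a predictor and the outcome normalised by the sum of their entropies, SU = 2·I(X;Y) / (H(X) + H(Y)), so that 0 indicates independence and 1 a perfectly informative predictor. The following is a minimal standard-library sketch on made-up data, not the authors' implementation. It assumes predictors are already discrete (continuous strength values would first need binning), and all feature names and values are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(x, y):
    """SU = 2 * I(X;Y) / (H(X) + H(Y)), bounded between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:  # both variables are constant
        return 0.0
    mutual_information = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_information / (hx + hy)

# Hypothetical, already-discretised toy data for eight players.
injured = [1, 1, 0, 0, 0, 0, 1, 0]
features = {
    "feature_a": [0, 1, 0, 1, 0, 1, 0, 1],  # nearly unrelated to outcome
    "feature_b": [1, 1, 0, 0, 0, 0, 1, 0],  # mirrors the outcome exactly
    "feature_c": [1, 0, 0, 0, 1, 0, 1, 0],  # partly informative
}

scores = {name: symmetrical_uncertainty(col, injured)
          for name, col in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: SU = {scores[name]:.3f}")
```

With only a handful of events, entropy estimates of this kind are noisy, which is one reason the ranking in Figure 3 is described as exploratory.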

Figure 4. Bootstrap distribution of AUC values for the logistic regression model (n = 200 resamples, test set evaluation). The vertical dashed line indicates the mean AUC = 0.68. The distribution shows modest discrimination, with most resampled values clustering between 0.60 and 0.75, consistent with the limited number of outcome events.
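
A bootstrap distribution of this kind can be sketched as follows. This is a generic percentile bootstrap written with the standard library only, under the assumption that players are resampled with replacement and that resamples containing a single outcome class are redrawn. The exact procedure used for Figure 4 is not specified here, and the toy outcomes and scores are hypothetical.

```python
import random
import statistics

def auc(y_true, scores):
    """AUC as the share of (injured, uninjured) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc(y_true, scores, n_resamples=200, seed=0):
    """Mean AUC and approximate 95% percentile interval over resamples."""
    auc(y_true, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    while len(values) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # redraw single-class resamples
            values.append(auc(ys, [scores[i] for i in idx]))
    values.sort()
    lower = values[int(0.025 * n_resamples)]
    upper = values[int(0.975 * n_resamples) - 1]
    return statistics.mean(values), (lower, upper)

# Hypothetical predicted risks for eight players (1 = injured).
y = [1, 1, 1, 0, 0, 0, 0, 0]
p = [0.90, 0.60, 0.40, 0.50, 0.30, 0.20, 0.10, 0.35]

point_estimate = auc(y, p)
mean_auc, (ci_low, ci_high) = bootstrap_auc(y, p)
print(f"AUC = {point_estimate:.3f}")
print(f"bootstrap mean = {mean_auc:.3f}, interval = ({ci_low:.3f}, {ci_high:.3f})")
```

When a test set contains very few injured players, each resample may include only one or two of them, so wide intervals are expected.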

Figure 5. Permutation importance of key predictors in the test set. Bars represent the mean change in AUC (ΔAUC) when the predictor was permuted (50 permutations), with error bars showing the standard deviation. Previous injury produced the largest reduction in model performance, dominant-leg hip abduction strength had a modest contribution, while hamstring strength and age had negligible or inconsistent effects.
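
The permutation procedure summarised in Figure 5 can be illustrated with a short sketch: one predictor column is shuffled, the model is re-scored, and the drop in AUC relative to the unshuffled data is averaged over repeated shuffles. The code below is an assumption-laden illustration, not the authors' pipeline. It defines ΔAUC as baseline minus permuted AUC (so larger positive values mean greater importance), and `toy_model`, the column layout, and the data are hypothetical.

```python
import random
import statistics

def auc(y_true, scores):
    """AUC as the share of (injured, uninjured) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_importance(predict, X, y, column, n_permutations=50, seed=0):
    """Mean and SD of the AUC drop after shuffling one predictor column."""
    rng = random.Random(seed)
    baseline = auc(y, [predict(row) for row in X])
    drops = []
    for _ in range(n_permutations):
        shuffled = [row[column] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:column] + [v] + row[column + 1:]
                  for row, v in zip(X, shuffled)]
        drops.append(baseline - auc(y, [predict(row) for row in X_perm]))
    return statistics.mean(drops), statistics.stdev(drops)

def toy_model(row):
    """Hypothetical fitted model: risk depends on column 0 only."""
    return row[0]

# Hypothetical data: column 0 drives the score, column 1 is ignored.
y = [1, 1, 0, 0, 0, 0, 1, 0]
X = [[0.90, 5.0], [0.80, 3.0], [0.20, 4.0], [0.30, 6.0],
     [0.10, 2.0], [0.40, 7.0], [0.70, 1.0], [0.35, 8.0]]

used_mean, used_sd = permutation_importance(toy_model, X, y, column=0)
unused_mean, unused_sd = permutation_importance(toy_model, X, y, column=1)
print(f"column 0: dAUC = {used_mean:.3f} (SD {used_sd:.3f})")
print(f"column 1: dAUC = {unused_mean:.3f} (SD {unused_sd:.3f})")
```

A predictor the model ignores yields a ΔAUC of exactly zero, which is the pattern described in the caption as a negligible contribution.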

Table 1. Cross-validation performance of the logistic regression model for predicting hamstring strain injury risk. Metrics are reported as mean ± SD across cross-validation folds.

Metric	Mean	SD	Description
Accuracy (%)	69.9	3.8	Proportion of all correctly classified cases
AUC	0.792	0.064	Area under the ROC curve
Sensitivity (Recall)	0.700	0.073	True positive rate (injured players correctly classified)
Precision	0.901	0.053	Positive predictive value
F1 Score	0.787	0.040	Harmonic mean of precision and recall
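
The metrics in Table 1 all derive from a confusion matrix. As a worked illustration, the sketch below uses counts inferred from the hold-out results given in the abstract (5 injured and 32 uninjured players, sensitivity 60%, specificity 65.6%), which imply 3 true positives, 2 false negatives, 21 true negatives, and 11 false positives. These counts are a reconstruction rather than values reported by the authors, and the function name is illustrative.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (injured = positive)."""
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }

# Assumption: counts reconstructed from the reported hold-out percentages.
holdout = classification_metrics(tp=3, fp=11, tn=21, fn=2)

for name, value in holdout.items():
    print(f"{name}: {value:.3f}")
```

The reconstructed counts reproduce the reported hold-out accuracy of 64.9%, which supports the inferred confusion matrix.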

Table 2. Final multivariable logistic regression model for hamstring strain injury (player-level outcome). Note: OR = odds ratio; CI = confidence interval; D = dominant leg; ND = non-dominant leg. Full model specification is provided in Supplementary Table S4.

Predictor	β (SE)	OR	95% CI	p-value
Intercept	0.963 (4.334)	2.62	0.10 – 12.80	0.824
Age (years)	−0.012 (0.050)	0.99	0.90 – 1.09	0.808
BMI (kg/m²)	0.034 (0.136)	1.03	0.79 – 1.35	0.805
Previous hamstring injury	−1.283 (0.805)	0.28	0.06 – 1.34	0.111
Hip abduction (dominant leg)	−0.200 (0.083)	0.82	0.70 – 0.96	0.016*
Hip flexion (non-dominant leg)	0.108 (0.067)	1.11	0.98 – 1.27	0.109
Hip adduction ratio (D/ND)	0.939 (1.520)	2.56	0.13 – 50.27	0.536
Hip abduction ratio (D/ND)	−0.414 (1.121)	0.66	0.07 – 5.95	0.712
Hamstring ratio (D/ND)	−0.346 (1.541)	0.71	0.04 – 14.51	0.822
Hip flexion ratio (D/ND)	2.304 (2.251)	10.02	0.12 – 826.22	0.306
Hip flexion/hamstring ratio (dominant leg)	−0.621 (0.947)	0.54	0.08 – 3.44	0.512
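
The odds ratios and confidence intervals in Table 2 follow from the coefficients and standard errors through OR = exp(β) and 95% CI = exp(β ± 1.96·SE). The short sketch below applies this standard Wald formula to two rows of the table; it assumes the intervals were computed this way, which the paper does not state explicitly, and the function name is illustrative.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence interval from a logit coefficient."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# Coefficients and standard errors taken from Table 2.
hip_abduction = odds_ratio_ci(-0.200, 0.083)
previous_injury = odds_ratio_ci(-1.283, 0.805)

for label, (or_, low, high) in [("Hip abduction (dominant leg)", hip_abduction),
                                ("Previous hamstring injury", previous_injury)]:
    print(f"{label}: OR = {or_:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Both rows reproduce the tabulated values. Applying the same formula to the intercept row (β = 0.963, SE = 4.334) gives an interval of roughly 0.0005 to 12,800, which does not appear to match the printed 0.10 to 12.80, so that row may warrant rechecking.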

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the authors and the preprint are cited in any reuse.