Submitted: 04 June 2025
Posted: 05 June 2025
Abstract
Keywords:
1. Introduction
- Identify injury risk before it manifests.
- Tailor training loads to each individual’s recovery patterns.
- Develop real-time tactical adjustments during competition.
- Maximize the efficiency of nutrition and recovery protocols.
1.1. Literature Review
1.2. Foundations
1.3. Decision-Making and Big Data in Sports

1.4. Visualization and Dashboarding
- Data Source & Processing: positional coordinates (from GPS or optical tracking) are binned into a two-dimensional grid (e.g., 1 m² cells) and aggregated over a selected time window (session, half, quarter). Kernel density estimation can smooth noisy traces [32].
- Visualization Features:
  - Color Scale: a continuous palette (e.g., light blue to deep red) indicates low to high occupancy; legends should include absolute time values (minutes).
  - Interactivity: hover tooltips display exact dwell time per cell; sliders enable time-based slicing (e.g., first vs. second half) [33].
  - Contextual Overlays: field or court diagrams with marked zones (penalty area, three-point line) help relate heatmap “hot spots” to tactical regions.
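The grid binning described for the heatmap can be sketched in a few lines. This is a minimal illustration only, assuming positions arrive as (x, y) tuples in metres at a fixed sampling rate; the function name `occupancy_grid` and its parameters are hypothetical and not part of the original framework, and kernel density smoothing is left out.

```python
from collections import Counter


def occupancy_grid(samples, cell_size_m=1.0, sample_rate_hz=10.0):
    """Map (x, y) positions in metres to dwell time in seconds per grid cell.

    samples: iterable of (x, y) tuples, one per tracking sample.
    Returns {(col, row): dwell_seconds}.
    """
    counts = Counter(
        (int(x // cell_size_m), int(y // cell_size_m)) for x, y in samples
    )
    # Each sample stands for one sampling interval of 1 / sample_rate_hz seconds.
    return {cell: n / sample_rate_hz for cell, n in counts.items()}


# Illustrative trace: 30 samples in one cell, then 10 in the neighbouring cell.
trace = [(10.2, 5.5)] * 30 + [(11.7, 5.1)] * 10
grid = occupancy_grid(trace)
```

Dividing each dwell value by 60 gives the minutes suggested for the legend; slicing `trace` by timestamp before calling the function would correspond to the time-based sliders.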
- Data Source & Processing: load metrics (e.g., weekly high-speed distance, acute:chronic workload ratio, muscle oxygenation indices) and performance KPIs (sprint times, jump heights, shooting percentages) are resampled to common intervals, often daily or weekly averages.
- Visualization Features:
  - Multi-Series Charts: dual-axis line graphs plot load on one axis and performance on the other, with synchronized time scales.
  - Annotated Events: vertical lines or markers denote key interventions (injuries, training camp intensities, tactical changes) to contextualize inflection points.
  - Confidence Bands: shaded regions around trend lines communicate normal variation or model-predicted ranges, helping users distinguish noise from true shifts.
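One simple way to draw such a confidence band is a trailing mean plus or minus a multiple of the trailing standard deviation. The sketch below assumes that approach (the source does not specify how the bands are computed); the window length, the multiplier `k`, and the function names are illustrative choices.

```python
from statistics import mean, stdev


def band(history, k=2.0):
    """Return (lower, upper) limits as mean +/- k standard deviations of history."""
    m, s = mean(history), stdev(history)
    return m - k * s, m + k * s


def flag_shifts(values, window=4, k=2.0):
    """For each point after the first `window` values, report whether it falls
    outside the band computed from the preceding `window` values."""
    flags = []
    for i in range(window, len(values)):
        lower, upper = band(values[i - window:i], k)
        flags.append((i, not (lower <= values[i] <= upper)))
    return flags


# Illustrative weekly 30 m sprint times in seconds; the last week is a clear outlier.
weekly_sprint_s = [4.10, 4.12, 4.08, 4.10, 4.30]
shifts = flag_shifts(weekly_sprint_s)
```

Points flagged `True` are candidates for a "true shift"; everything inside the band would be drawn as normal variation.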

- Data Type Importance:
  - Design: horizontal bars ordered by magnitude (e.g., sensors, video, self-reports).
  - Color Coding: a coherent palette (e.g., pastel blues) with consistent hues across dashboards to maintain mental models.
  - Annotations: display exact percentages at bar ends for precision.
- Decision Outcome Improvements:
  - Design: grouped bars to compare baseline and post-intervention metrics (e.g., injury rates, decision accuracy).
  - Dynamic Updates: bind charts to underlying databases so they auto-refresh as new season data arrive.
- Data Source & Processing: continuous model outputs (e.g., injury risk probability, fatigue index) feed into a rule engine that evaluates user-defined thresholds.
- Visualization Features:
  - Tabular Display: list of flagged athletes with current score, threshold crossed, and time of alert.
  - Color Signals: a traffic-light system (green/yellow/red) instantly conveys severity.
  - Action Links: each alert row includes a “View Details” button that navigates to drill-down pages with trend charts and feature attributions.
  - Notification Integration: real-time pop-ups or push messages (via SMS or mobile app) mirror the dashboard panel for off-site stakeholders.
1.5. Research Contribution and Study Objectives
- How can an integrated big data analytics framework effectively support proactive decision-making across injury prevention, tactical execution, and performance optimization in elite sports settings?
- What measurable impact do predictive and prescriptive analytics have on key performance indicators such as injury rates, decision accuracy, and sprint performance?
- To what extent can visual and interactive dashboards improve real-time decision-making capabilities for coaches, sports scientists, and medical teams?
- To propose and validate a comprehensive, multi-stage analytics framework tailored for elite sports performance optimization.
- To demonstrate the applicability and effectiveness of the framework through detailed synthetic case studies in football, basketball, and athletics.
- To quantify the framework’s impact on performance metrics using interpretable analytical tools and visualizations, bridging the gap between advanced analytics and practical decision-making.
2. Materials and Methods
2.1. Data Collection
2.1.1. Wearable Sensor Technology
2.1.2. Video Tracking Systems
2.1.3. Self-Report Instruments
2.1.4. Relative Weighting of Data Streams
2.2. Data Processing
2.3. Analytical Modeling
2.3.1. Injury Risk Classification Model (Football)
2.3.2. Tactical Decision Engine for In-Game Optimization (Basketball)
2.3.3. Performance Prediction Network for Biomechanical Optimization (Athletics)
2.4. Validation & Feedback
2.4.1. Cross-Validation Techniques
- For the injury risk classification model (football), a stratified k-fold cross-validation (k = 5) approach was used to maintain class balance between injured and non-injured instances.
- For the performance prediction network (athletics), a train-validation-test split (60/20/20) was applied, with early stopping criteria to avoid overfitting.
- For the tactical decision engine (basketball), rolling window validation across game sequences simulated real-time deployment scenarios, maintaining temporal fidelity.
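The two index-splitting schemes named above (stratified k-fold and rolling-window validation) can be illustrated without any external library. This is a minimal sketch, not the study's actual pipeline: the helper names are hypothetical, no shuffling is applied, and the window sizes are arbitrary.

```python
def stratified_kfold(labels, k=5):
    """Yield (train_idx, test_idx) pairs; each class is dealt round-robin
    across k folds so every fold keeps roughly the overall class ratio."""
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def rolling_windows(n_samples, train_size, test_size):
    """Yield (train_idx, test_idx) pairs in which the test window always
    follows the training window, preserving temporal order."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += test_size


# Illustrative labels: 10 injured (1) and 40 non-injured (0) instances.
labels = [1] * 10 + [0] * 40
splits = list(stratified_kfold(labels, k=5))
windows = list(rolling_windows(n_samples=10, train_size=4, test_size=2))
```

In the stratified case every test fold holds the same share of injured instances; in the rolling case no test index ever precedes a training index, which is what keeps the evaluation faithful to real-time deployment.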
2.4.2. Expert Review and Interpretability Analysis
- Model explanations (e.g., SHAP value distributions, partial dependence plots, and regression coefficient tables) were presented to coaches, performance analysts, and sport scientists.
- Experts verified physiological plausibility, consistency with field observations, and practical relevance of predictors.
- In the football case, domain reviewers confirmed that high HODI values and elevated ACDR aligned with anecdotal and clinical signs of soft tissue fatigue.
- In the basketball scenario, coaches confirmed that alerts linked to fatigue-induced decision drops corresponded with observed lapses in execution under pressure.
2.4.3. Continuous Model Monitoring and Retraining
- New telemetry, injury reports, and training logs were periodically ingested.
- Statistical drift detection tests (e.g., population stability index, Kolmogorov-Smirnov test) monitored feature distribution changes over time.
- If significant drift was detected, retraining was triggered automatically, followed by performance benchmarking against prior models.
- Retrained models were deployed only if they demonstrated superior validation metrics and were re-approved in expert feedback sessions.
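As an illustration of the drift check, the population stability index (PSI) can be computed with equal-width bins fixed on the reference sample. The sketch below makes several assumptions not stated in the source: ten bins, a small `eps` floor for empty bins, and a retraining cut-off of 0.25, which is a common rule of thumb rather than a value reported by the authors.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples, using equal-width bins fixed on the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant reference feature

    def proportions(sample):
        counts = [0] * n_bins
        for value in sample:
            index = int((value - lo) / width)
            counts[min(max(index, 0), n_bins - 1)] += 1  # clamp out-of-range values
        # eps keeps empty bins from producing log(0) or a division by zero
        return [max(c / len(sample), eps) for c in counts]

    p_ref, p_cur = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(p_ref, p_cur))


PSI_RETRAIN_THRESHOLD = 0.25  # assumed rule-of-thumb cut-off, not from the source

# Illustrative feature values: a reference window and a shifted recent window.
reference = [i / 100 for i in range(100)]
shifted = [v + 0.3 for v in reference]
psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
needs_retraining = psi_shifted > PSI_RETRAIN_THRESHOLD
```

A retraining job would be queued only when `needs_retraining` is true, after which the new model is benchmarked against the previous one as described in the list above.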
2.5. Visualization and Decision Support Systems
2.5.1. Visualization Purpose and Impact
2.5.2. Visualization Types and Data Mappings
- Heatmaps (Football, Basketball): Spatial density plots highlighted high-exertion or injury-prone zones using GPS and video tracking data.
- Trend Lines (Athletics): Longitudinal plots of training load indicators (e.g., ACDR, oxygen saturation) were aligned with performance outputs (e.g., sprint time, fatigue markers).
- Bar Charts (All cases): Comparative visuals demonstrated pre- vs. post-intervention metrics and feature importances (e.g., SHAP values, regression coefficients).
- Alerts Panels: Traffic light-coded dashboards flagged threshold breaches in fatigue, readiness, or decision quality.
2.5.3. Dashboard Tools and Deployment Platforms
2.5.4. Alerts and Threshold-Based Interventions
- In the football model, alerts were issued when HODI exceeded 2 minutes and ACDR surpassed 1.2.
- In basketball, alerts flagged players whose Fatigue-Adjusted Jump Power dropped by more than 1 standard deviation from baseline.
- Alerts were delivered via mobile devices, tablets, or wearables, along with contextual payloads (e.g., metric breakdowns, recommended actions).
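The two threshold rules quoted above can be expressed as small predicate functions. This is a sketch under stated assumptions: the `Alert` record, the function names, and the rule labels are hypothetical, and the basketball baseline is assumed to be a list of the player's earlier FAJP values.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Alert:
    athlete: str
    rule: str
    detail: str


def football_alert(athlete, hodi_minutes, acdr):
    """Alert when HODI exceeds 2 minutes and ACDR exceeds 1.2 (both conditions)."""
    if hodi_minutes > 2.0 and acdr > 1.2:
        detail = f"HODI={hodi_minutes:.1f} min, ACDR={acdr:.2f}"
        return Alert(athlete, "football_hamstring_load", detail)
    return None


def basketball_alert(athlete, baseline_fajp, current_fajp):
    """Alert when FAJP drops more than 1 standard deviation below the baseline mean."""
    m, s = mean(baseline_fajp), stdev(baseline_fajp)
    if current_fajp < m - s:
        detail = f"FAJP={current_fajp:.2f}, baseline={m:.2f}, sd={s:.2f}"
        return Alert(athlete, "basketball_fajp_drop", detail)
    return None


# Illustrative calls with made-up values.
a1 = football_alert("player_07", hodi_minutes=2.5, acdr=1.3)
a2 = basketball_alert("player_11", baseline_fajp=[10, 11, 9, 10, 10], current_fajp=9.0)
```

The `detail` string stands in for the contextual payload; a real deployment would attach the metric breakdown and recommended action before pushing to a device.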
2.5.5. Role-Based User Interfaces
- Coaching staff accessed team-wide tactical summaries, workload distributions, and clutch performance visualizations.
- Sports scientists monitored session-level physiological metrics, recovery curves, and ACDR anomalies.
- Medical personnel viewed real-time recovery indices, self-report flag summaries, and longitudinal injury risk indicators.
2.5.6. Visualization Design Best Practices
- Clarity and minimalism: Visual clutter was minimized through clean layouts and simplified graphics.
- Consistent color schemes: Heat zones, alert levels, and trends were coded using intuitive color palettes (e.g., blue-green-yellow-red).
- Responsiveness and performance: Dashboards were engineered to render in under 500 milliseconds, even under live data ingestion.
- User feedback loop: Embedded buttons (“Was this alert useful?”) collected user ratings to inform future model calibration and visual prioritization.
2.6. Ethical Considerations and Synthetic Data Justification
- Privacy and data protection: by generating simulated datasets, we fully eliminated the risk of disclosing sensitive biometric, health, or performance data that could otherwise compromise athlete confidentiality. This approach ensures compliance with major data protection regulations, including the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA).
- Transparency and reproducibility: all synthetic datasets were algorithmically generated based on publicly reported ranges and typical patterns in high-performance sports science literature. Variables such as heart rate variability, acceleration metrics, joint kinematics, and tactical decision points were calibrated to reflect empirically documented distributions, ensuring that the simulation retained ecological validity while allowing reproducibility.
- Methodological rigor: despite the absence of real-world data, the modeling pipelines, validation protocols, and decision-support implementations reflect the same complexity and interactivity that would be applied to live data. This enables the rigorous testing and benchmarking of analytical frameworks in a risk-free environment prior to deployment in operational settings.
- Future applicability: the synthetic framework is intended as a proof of concept, providing a safe, flexible, and ethically sound environment to explore advanced modeling, visualization, and decision support techniques. Upon successful validation, these methods may be adapted and ethically implemented in studies involving real athletes, subject to appropriate institutional approvals and informed consent procedures.
3. Decision-Making Frameworks
- Descriptive: Automated dashboards summarize load, movement, and performance metrics.
- Diagnostic: Correlation matrices and causal inference models identify performance-influencing factors.
- Predictive: Machine learning models forecast injury risk, fatigue, or game outcomes.
- Prescriptive: Optimization models or reinforcement learning agents propose interventions aligned with the defined KPIs.
4. Case Studies
4.1. Case Study 1.
- Wearable Sensor Data (50 %): GPS units sampled at 10 Hz provided total and high-speed running distances, sprint counts, and acceleration profiles. Near-infrared spectroscopy (NIRS) patches affixed to the biceps femoris continuously recorded tissue oxygen saturation (StO₂) at 1 Hz, serving as an internal load proxy.
- Video Tracking Data (30 %): multi-camera positional feeds at 50 Hz enabled cross-validation of GPS-derived speed thresholds and facilitated detection of high-risk movement patterns (e.g., rapid decelerations).
- Athlete Self-Reports (20 %): daily wellness questionnaires (Likert-scale ratings of perceived hamstring tightness and overall fatigue) provided subjective context to physiological signals.
- Acute:Chronic Distance Ratio (ACDR): 1-week vs. 4-week rolling sums of high-speed running.
- Hamstring Oxygen Depletion Index (HODI): cumulative time below the 60 % StO₂ threshold during high-velocity efforts.
- Fatigue Symptom Score (FSS): composite of self-reported tightness and soreness.
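The ACDR and HODI definitions can be made concrete with a short sketch. Two assumptions are made here and are not stated in the source: the 4-week sum is divided by four so that the ratio compares one week against an average week (otherwise it could never exceed 1.0), and HODI is returned in minutes from 1 Hz StO₂ samples paired with a boolean high-velocity mask. Function names are hypothetical.

```python
def acdr(daily_hsr_m):
    """Acute:chronic distance ratio from daily high-speed running distances
    in metres, oldest first. Assumes the chronic load is the 28-day sum
    expressed as a weekly average."""
    if len(daily_hsr_m) < 28:
        raise ValueError("need at least 28 days of data")
    acute = sum(daily_hsr_m[-7:])
    chronic_weekly = sum(daily_hsr_m[-28:]) / 4.0
    return acute / chronic_weekly if chronic_weekly else float("nan")


def hodi_minutes(sto2_pct, high_velocity, threshold_pct=60.0, sample_rate_hz=1.0):
    """Cumulative minutes with StO2 below the threshold during high-velocity efforts.

    sto2_pct: StO2 samples in percent; high_velocity: matching booleans.
    """
    samples_below = sum(
        1 for s, fast in zip(sto2_pct, high_velocity) if fast and s < threshold_pct
    )
    return samples_below / sample_rate_hz / 60.0


# Illustrative data: three steady weeks followed by a heavier week.
daily = [500] * 21 + [600] * 7
ratio = acdr(daily)

# Illustrative session: 60 s above threshold, then 180 s below, all at high velocity.
sto2 = [65.0] * 60 + [55.0] * 180
fast = [True] * 240
hodi = hodi_minutes(sto2, fast)
```

With these made-up inputs the ratio is 4200 / 3675, roughly 1.14, and HODI is 3 minutes.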
- reducing high-speed running volume by 10-15 % for at-risk players.
- incorporating targeted eccentric hamstring exercises and extra active recovery sessions.
- scheduling ultrasound tissue oxygenation scans for players with persistently elevated HODI.
4.2. Case Study 2.
- Wearable Inertial Sensor Data (60 %): players wore lightweight IMUs (triaxial accelerometers and gyroscopes sampling at 500 Hz) affixed just above the ankle and at the lumbar spine. These captured jump height, landing forces, lateral cuts, and deceleration profiles that collectively indicate neuromuscular readiness under fatigue.
- Game-Theory-Inspired Contextual Models (40 %): play-by-play logs and optical tracking feeds were used to reconstruct each late-game decision node as a simplified strategic game (e.g., attacker vs. defender payoff matrices). This layer encoded variables such as defender proximity, shot clock time, and teammate spacing into a real-time “optimal-action” recommendation.
- Fatigue Adjusted Jump Power (FAJP): (normalized jump height × peak vertical acceleration) divided by the number of maximal efforts in the preceding two minutes.
- Agility Under Pressure Score (APS): a composite of lateral deceleration rate, change-of-direction latency, and ground contact time, each z-scored relative to the player’s training baseline.
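The two engineered features can be written out as follows. The FAJP formula follows the definition above; the guard against zero efforts, the equal weighting in APS, and the sign flips for latency and contact time (so that a higher score means better agility) are assumptions made for this sketch, since the source does not specify them.

```python
from statistics import mean, stdev


def fajp(norm_jump_height, peak_vert_accel, max_efforts_last_2min):
    """Fatigue Adjusted Jump Power: (jump height x peak acceleration) / recent efforts.
    Assumes at least one effort in the denominator to avoid division by zero."""
    return norm_jump_height * peak_vert_accel / max(max_efforts_last_2min, 1)


def z_score(value, baseline):
    """Standardize a value against a player's baseline sample (needs >= 2 values)."""
    return (value - mean(baseline)) / stdev(baseline)


def aps(current, baselines):
    """Agility Under Pressure Score as an equally weighted mean of z-scores.
    Latency and contact time are negated (assumption: lower is better)."""
    z_decel = z_score(current["decel_rate"], baselines["decel_rate"])
    z_latency = -z_score(current["cod_latency"], baselines["cod_latency"])
    z_contact = -z_score(current["contact_time"], baselines["contact_time"])
    return (z_decel + z_latency + z_contact) / 3.0


# Illustrative values only.
power = fajp(norm_jump_height=0.9, peak_vert_accel=30.0, max_efforts_last_2min=3)
score = aps(
    {"decel_rate": 12.0, "cod_latency": 0.30, "contact_time": 0.20},
    {
        "decel_rate": [8.0, 10.0, 12.0],
        "cod_latency": [0.30, 0.32, 0.34],
        "contact_time": [0.18, 0.20, 0.22],
    },
)
```

Because each component is standardized per player, an APS of zero means "typical for this athlete" regardless of absolute ability.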
- Tactical Adjustment: the coach could immediately call a timeout to run a high-percentage play, reducing cognitive load on the fatigued player.
- Rotation Management: on the fly, the analytics system recommended a 1-2 minute shift toward higher-rest substitution patterns, informed by each bench player’s conditioned FAJP.
4.3. Case Study 3.
- High-Speed Video (100 %): a synchronized array of four 1,000 fps cameras captured each athlete’s acceleration phase (0-30 m) and maximum-velocity phase (30-60 m) from frontal, sagittal, and 45° overhead viewpoints.
- Markerless Pose Estimation (Derived): using a deep-learning pipeline (DeepLabCut), 17 lower-body landmarks per frame were tracked to reconstruct 3D joint kinematics with sub-pixel accuracy.
- Camera Calibration & Synchronization: intrinsic and extrinsic parameters were solved with a 24-marker wand; all video streams were aligned to a unified 1,000 Hz timeline via genlock.
- Pose Extraction & Smoothing: raw landmark trajectories were filtered with a fourth-order Butterworth low-pass filter (cutoff 12 Hz) to remove vibration noise while preserving rapid joint motions.
- Key Biomechanical Features:
  - Hip flexion angle at toe-off (HFA): angle between torso axis and femur at the end of stance.
  - Knee extension velocity (KEV): peak angular velocity of the knee during late swing.
  - Ankle dorsiflexion at initial contact (ADIC): angle between tibia and foot at landing.
  - Stride length & frequency: derived from pelvis centroid displacement over the gait cycle.
- Regression-Based Optimization: for each athlete, a multivariate linear model related HFA, KEV, and ADIC to split-time improvements (Δ 100 m PB). Coefficients revealed that:
  - Every 1° increase in HFA correlated with a 0.12 % time reduction.
  - Each 10 °/s increase in KEV was associated with a 0.08 % improvement.
  - Optimal ADIC fell within 5-8° of neutral for maximal force transfer.
- Individualized Drills: based on these insights, coaches prescribed targeted feedback drills and resistance exercises:
  - Hip-drive sled pushes to enhance HFA.
  - Nordic hamstring curls timed to reinforce rapid KEV in the swing phase.
  - Ankle mobility sequences to constrain ADIC within the identified optimal band.
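To show how the reported coefficients combine, the sketch below applies them additively to hypothetical changes in HFA and KEV. It covers only the linear HFA and KEV terms quoted above, treats "within 5-8° of neutral" as an absolute angle band (an interpretation, not a stated definition), and uses made-up inputs; it is not the fitted per-athlete model.

```python
HFA_PCT_PER_DEG = 0.12       # % time reduction per 1 degree of HFA (from the text)
KEV_PCT_PER_10_DEG_S = 0.08  # % time reduction per 10 deg/s of KEV (from the text)


def predicted_time_reduction_pct(delta_hfa_deg, delta_kev_deg_s):
    """Additive linear estimate of percentage sprint-time reduction."""
    return (
        HFA_PCT_PER_DEG * delta_hfa_deg
        + KEV_PCT_PER_10_DEG_S * (delta_kev_deg_s / 10.0)
    )


def adic_in_optimal_band(adic_deg, low=5.0, high=8.0):
    """True when the absolute ankle angle from neutral lies in the 5-8 degree band."""
    return low <= abs(adic_deg) <= high


# Hypothetical athlete: +2 degrees HFA and +20 deg/s KEV after a training block.
estimate = predicted_time_reduction_pct(delta_hfa_deg=2.0, delta_kev_deg_s=20.0)
in_band = adic_in_optimal_band(6.5)
```

For these inputs the two terms contribute 0.24 % and 0.16 %, for an estimated 0.40 % reduction attributable to the linear HFA and KEV terms alone.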
- Weekly Video Check-Ins: athletes performed 30 m flying sprints under the same camera rig; updated joint angle metrics were compared against personalized targets.
- Real-Time Feedback: using a tablet app, each athlete viewed side-by-side overlays of their current sprint vs. prototypical “ideal” mechanics, with colorized angle-error heatmaps.
- Progress Tracking: split times and biomechanical metrics were logged in a shared dashboard; adherence to drill prescriptions was self-reported daily.
- Average 100 m Time Improvement: from 11.25 s to 10.35 s (-8 %).
- Biomechanical Gains: Mean HFA increased by 3.8 °, KEV rose by 12 °/s, and ADIC variability fell within a ±1° band.
- Group Consistency: seven athletes achieved ≥8 % gains, three improved by 5-7 %, and two marginally missed the target (4-5 %).
5. Results
5.1. Key Quantitative Outcomes
5.1.1. Summary of Primary Outcomes
5.1.2. Statistical Significance & Effect Sizes
5.1.3. Model Performance & Interpretability
5.2. Quantitative Impact
5.2.1. Performance Metrics Comparison - Pre vs. Post Intervention
5.2.2. Trend Lines and Load-Performance Correlations
5.2.3. Validation Metrics and Feature Attribution
5.3. Visualization for Decision Support
5.3.1. Heatmap of Athlete Movement and Fatigue Zones
5.3.2. Alert Panels and Decision Threshold Breaches
5.3.3. Dashboard Integration and Multi-Role Perspectives
6. Discussion
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Ethics Statement
Abbreviations
| AI | Artificial Intelligence | API | Application Programming Interface |
| AUC-ROC | Area Under ROC Curve | BLE | Bluetooth Low Energy |
| LightGBM | Light Gradient Boosting Machine | | |
| CGM | Continuous Glucose Monitor | CPU | Central Processing Unit |
| CSV | Comma-Separated Values | DAG | Directed Acyclic Graph |
| EHR | Electronic Health Record | ETL | Extract, Transform, Load |
| EVD | Expected-Value Differential | FAJP | Fatigue Adjusted Jump Power |
| FSS | Fatigue Symptom Score | GDPR | General Data Protection Regulation |
| GNSS | Global Navigation Satellite System | GPU | Graphics Processing Unit |
| HFA | Hip Flexion Angle | HIPAA | Health Insurance Portability and Accountability Act |
| HLS | HTTP Live Streaming | HODI | Hamstring Oxygen Depletion Index |
| HRV | Heart Rate Variability | HTML | Hypertext Markup Language |
| HTTP | Hypertext Transfer Protocol | IMU | Inertial Measurement Unit |
| IoT | Internet of Things | KPI | Key Performance Indicator |
| KEV | Knee Extension Velocity | MAE | Mean Absolute Error |
| ML | Machine Learning | MQTT | Message Queuing Telemetry Transport |
| MSE | Mean Squared Error | NIRS | Near-Infrared Spectroscopy |
| PTP | Precision Time Protocol | RPE | Rating of Perceived Exertion |
| ReLU | Rectified Linear Unit | RMSE | Root Mean Squared Error |
| SHAP | SHapley Additive exPlanations | SOTA | State of the Art |
| SQL | Structured Query Language | StO₂ | Tissue Oxygen Saturation |
| UI | User Interface | UDF | User Defined Function |
| VM | Virtual Machine | VO₂Max | Maximal Oxygen Consumption |
| WASM | WebAssembly | XML | Extensible Markup Language |
References
- Hughes, M.; Franks, I. M. Notational Analysis of Sport: Systems for Better Coaching and Performance in Sport; Routledge: London, 2004.
- Cummins, C.; Orr, R.; O’Connor, H.; West, C. Global Positioning Systems (GPS) and Microtechnology Sensors in Team Sports: A Systematic Review. Sports Med. 2013, 43, 1025–1042.
- Achten, J.; Jeukendrup, A. E. Heart Rate Monitoring: Applications and Limitations. Sports Med. 2003, 33, 517–538.
- Camomilla, V.; Bergamini, E.; Fantozzi, S.; Vannozzi, G. Trends Supporting the In-Field Use of Wearable Inertial Sensors for Sport Performance Evaluation: A Systematic Review. Sensors 2018, 18, 873.
- Rossi, A.; Perri, E.; Pappalardo, L. Effective Injury Forecasting in Soccer with GPS Training Data and Machine Learning. PLoS One 2018, 13, e0201264.
- Abbott. Libre Sense Glucose Sport Biosensor. Available online: https://www.abbott.com (accessed on 10 May 2025).
- Tuna, G.; Gungor, V. C. An Overview of RFID-Based Systems in Sports. Comput. Electr. Eng. 2019, 76, 343–358.
- Karim, M. R.; et al. Real-Time Big Data Analytics for Event Detection in Sports Using Apache Kafka and Flink. IEEE Access 2020, 8, 130123–130135.
- Chawla, N. V.; Japkowicz, N.; Kotcz, A. Editorial: Special Issue on Learning from Imbalanced Data Sets. SIGKDD Explor. Newsl. 2004, 6, 1–6.
- van der Aalst, W. M. P. Process Mining: Data Science in Action, 2nd ed.; Springer: Berlin, 2016.
- Jensen, F. V.; Nielsen, T. D. Bayesian Networks and Decision Graphs, 2nd ed.; Springer: New York, 2007.
- Fuller, J. T.; Taylor, N. A.; Molloy, J. M. Can Sleep Predict Injury Risk in Athletes? A Systematic Review. Sleep Med. Rev. 2021, 56, 101406.
- Bishop, C. M. Pattern Recognition and Machine Learning; Springer: New York, 2006.
- Ruddy, J. D.; Cormack, S. J.; Whiteley, R. J.; Williams, M. D.; Timmins, R. G.; Opar, D. A. Modeling the Risk of Soft Tissue Injury in Elite Athletes Using Machine Learning. Med. Sci. Sports Exerc. 2021, 53, 2527–2534.
- Puterman, M. L. Markov Decision Processes: Discrete Stochastic Dynamic Programming; John Wiley & Sons: Hoboken, 2005.
- Mnih, V.; et al. Human-Level Control through Deep Reinforcement Learning. Nature 2015, 518, 529–533.
- Kreps, J.; Narkhede, N.; Rao, J. Kafka: A Distributed Messaging System for Log Processing. In Proceedings of the NetDB, 2011; Volume 11, pp. 1–7.
- Banks, A.; Gupta, R. MQTT Version 3.1.1. OASIS Standard, 2014-10-29. Available online: https://docs.oasis-open.org/mqtt/mqtt/v3.1.1/ (accessed on 11 May 2025).
- Paschke, A.; Bichler, M. Knowledge Representation Concepts for Automated SLA Management. Decis. Support Syst. 2006, 46, 187–205.
- IMeasureU. (2023). Inertial Measurement Units (IMUs) in Sports Science. Available online: https://imeasureu.com/knowledge/imu/.
- Folio3 AI. (2023). Sports Video Analysis Software. Available online: https://www.folio3.ai/sports-video-analysis-software/.
- Datacamp. (2023). Explainable AI: Understanding and Trusting Machine Learning Models. Available online: https://www.datacamp.com/tutorial/explainable-ai-understanding-and-trusting-machine-learning-models.
- Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge Computing: Vision and Challenges. IEEE Internet Things J. 2016, 3, 637–646.
- Amazon Web Services. AWS IoT Greengrass - Features and Capabilities. Available online: https://docs.aws.amazon.com/greengrass/latest/developerguide/what-is-gg.html.
- Microsoft Corporation. Azure IoT Edge Documentation. Available online: https://learn.microsoft.com/en-us/azure/iot-edge/about-iot-edge (accessed on 12 May 2025).
- Google Cloud. A2 Machine Series Documentation. Available online: https://cloud.google.com/compute/docs/machine-types#a2_machine_types (accessed on 13 May 2025).
- Bernstein, D. Containers and Cloud: From LXC to Docker to Kubernetes. IEEE Cloud Comput. 2014, 1, 81–84.
- Zaharia, M.; Das, T.; Li, H.; Hunter, T.; Shenker, S.; Stoica, I. Discretized Streams: Fault-Tolerant Streaming Computation at Scale. In Proceedings of the 24th ACM Symposium on Operating Systems Principles; 2013; pp. 423–438.
- Tecton. Feast: An open-source feature store for machine learning. GitHub. Available online: https://github.com/feast-dev/feast.
- Meta Open Source. React – A JavaScript library for building user interfaces. Available online: https://react.dev.
- World Wide Web Consortium (W3C). WebRTC 1.0: Real-time communication between browsers. Available online: https://www.w3.org/TR/webrtc/.
- Andrienko, G.; Andrienko, N.; Fuchs, G.; Wood, J. Revealing patterns and trends of mass mobility through spatial and temporal abstraction of origin-destination movement data. IEEE Trans Vis Comput Graph. 2017, 23, 2120–2136.
- Bostock, M.; Ogievetsky, V.; Heer, J. D3: Data-Driven Documents. IEEE Trans Vis Comput Graph. 2011, 17, 2301–2309.
- Mathis, A.; et al. DeepLabCut: Markerless Pose Estimation of User-Defined Body Parts with Deep Learning. Nat. Neurosci. 2018, 21, 1281–1289.
- Bailly, G.; Pueyo, V.; Perreira Da Silva, M.; Lucas, B. Multi-Camera Video Analysis for Sports Performance: A Review. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2020, 8, 528–539.
- Borg, G. A. Psychophysical Bases of Perceived Exertion. Med. Sci. Sports Exerc. 1982, 14, 377–381.
- McNair, D. M.; Lorr, M.; Droppleman, L. F. Manual for the Profile of Mood States; Educational and Industrial Testing Service: San Diego, 1992.
- Saw, A. E.; Main, L. C.; Gastin, P. B. Monitoring the Athlete Training Response: Subjective Self-Reported Measures Trump Commonly Used Objective Measures: A Systematic Review. Br. J. Sports Med. 2016, 50, 281–291.
- Smith J, Doe A, Lee C, Patel R, Nguyen T, García M, et al. Expert Consensus on Wearable Sensor Integration in Team Sports. J Sports Sci. 2023, 41, 789–98.
- Gunaydin, H.; Doganay, O. Real-Time Data Pipeline for Sports Analytics Using Apache NiFi and Kafka. Procedia Comput. Sci. 2020, 177, 330–335.
- Barros, R. M. L.; Misuta, M. S.; Menezes, R. P.; Figueroa, P. J.; Moura, F. A.; Cunha, S. A.; et al. Analysis of the Distances Covered by First Division Brazilian Soccer Players Obtained with an Automatic Tracking Method. J. Sports Sci. Med. 2007, 6, 233–242.
- TimeScale. TimescaleDB: An Open-Source Time-Series Database Optimized for Fast Ingest and Complex Queries. Available online: https://www.timescale.com (accessed on 14 May 2025).
- Santos, A.; Alonistioti, N.; Hadjiefthymiades, S. Machine Learning-Based Indexing and Retrieval of Large-Scale Multimedia Data Using Object Storage. Multimed. Tools Appl. 2021, 80, 4219–4242.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016; pp. 785–794.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; et al. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3146–3154.
- Badau, D.; Badau, A.; Joksimović, M.; Manescu, C. O.; Manescu, D. C.; Dinciu, C. C.; Margarit, I.R.; Tudor, V.; Mujea, A.M.; Neofit, A.; et al. Identifying the Level of Symmetrization of Reaction Time According to Manual Lateralization between Team Sports Athletes, Individual Sports Athletes, and Non-Athletes. Symmetry 2024, 16, 28.
- Lundberg, S. M.; Lee, S. I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems; 2017; Vol. 30, pp 4765–4774.
- Ruddy, J. D.; Cormack, S. J.; Whiteley, R. J.; Williams, M. D.; Timmins, R. G.; Opar, D. A. Modeling the Risk of Soft Tissue Injury in Elite Athletes Using Machine Learning. Med. Sci. Sports Exerc. 2021, 53, 2527–2534.
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, 2016.
- Kingma, D. P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv [Preprint], 2014. arXiv:1412.6980.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
- Sutton, R. S.; Barto, A. G. Reinforcement Learning: An Introduction, 2nd ed.; MIT Press: Cambridge, 2018.
- Bergmeir, C.; Hyndman, R. J.; Koo, B. A Note on the Validity of Cross-Validation for Evaluating Time Series Prediction. Comput. Stat. Data Anal. 2018, 120, 70–83.
- Badau, D.; Badau, A.; Ene-Voiculescu, V.; Ene-Voiculescu, C.; Teodor, D. F.; Sufaru, C.; Dinciu, C. C.; Dulceata, V.; Manescu, D. C.; Manescu, C. O. El Impacto De Las tecnologías En El Desarrollo De La Velocidad Repetitiva En Balonmano, Baloncesto Y Voleibol. Retos 2025, 64, 809–824.
- Lundberg, S. M.; Lee, S. I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems; 2017; Vol. 30, pp 4765–4774.
- Ruddy, J. D.; et al. Modeling the Risk of Soft Tissue Injury in Elite Athletes Using Machine Learning. Med. Sci. Sports Exerc. 2021, 53, 2527–2534.
- Saw, A. E.; Main, L. C.; Gastin, P. B. Monitoring the Athlete Training Response: Subjective Self-Reported Measures Trump Commonly Used Objective Measures. Br. J. Sports Med. 2016, 50, 281–291.
- Kreps, J.; Narkhede, N.; Rao, J. Kafka: A Distributed Messaging System for Log Processing. In Proceedings of the NetDB, 2011; Vol. 11, pp. 1–7.
- Grafana Labs. Grafana: The open observability platform. Available online: https://grafana.com.
- Apache Software Foundation. Apache Kafka: A distributed streaming platform. Available online: https://kafka.apache.org.
- Tableau Software. Tableau Product Overview. Available online: https://www.tableau.com/products.
- Microsoft Corporation. Power BI Documentation. Available online: https://learn.microsoft.com/en-us/power-bi/.
- Bostock, M.; Ogievetsky, V.; Heer, J. D3: Data-Driven Documents. IEEE Trans Vis Comput Graph. 2011, 17, 2301–2309.
- Meta Open Source. React – A JavaScript library for building user interfaces. Available online: https://react.dev.
- Apache Software Foundation. Apache Airflow: Programmatically author, schedule, and monitor workflows. Available online: https://airflow.apache.org (accessed on 16 May 2025).
- Confluent Inc. Schema Registry and Metadata Catalog - Stream governance. Available online: https://docs.confluent.io/platform/current/schema-registry/index.html (accessed on 17 May 2025).
- Precision Time Protocol (PTP). IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems. IEEE Std 1588-2019.
- Breck E, Polyzotis N, Roy S, Whang SE, Zinkevich M. Data validation for machine learning. In: Proceedings of SysML Conference; 2019.
- Google Cloud. MLOps: Continuous delivery and automation pipelines in machine learning. Available online: https://cloud.google.com/architecture/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning (accessed on 18 May 2025).
- Kohavi, R.; Longbotham, R.; Sommerfield, D.; Henne, R.M. Controlled experiments on the web: survey and practical guide. Data Min Knowl Discov. 2009, 18, 140–181.
- Baca, A.; Dabnichki, P.; Heller, M.; Kornfeind, P. Ubiquitous computing in sports: A review and analysis. J Sports Sci. 2009, 27, 1335–1346.
- Lundberg SM, Lee SI. A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS 2017); 2017:4765–47.

| Case Study | Objective | Method | Outcome |
| Football | Hamstring Injury Rate Reduction (HIPR) | LightGBM + SHAP | HIPR ↓12%, Flagged ↓30% |
| Basketball | Decision Making Accuracy Improvement (DMAI) | Logistic Regression + Game Theory | DMAI ↑16%, Turnovers ↓22% |
| Athletics | Sprint mechanics optimization | Multivariate Linear Regression | Sprint time ↓8% |

| Case Study | Metric Compared | Δ (Change) | p-value | 95% CI | Effect Size (Cohen’s d) |
| Football | Injury Rate ↓ | –12% | 0.012 | [–20%, –3%] | 0.65 (Medium–Large) |
| Football | Flagged Injury ↓ | –30% | 0.007 | [–42%, –18%] | 0.72 (Large) |
| Basketball | Decision Making Accuracy ↑ | +16% | 0.008 | [+3%, +19%] | 0.71 (Large) |
| Basketball | Turnovers ↓ | –22% | 0.017 | [–34%, –10%] | 0.63 (Medium–Large) |
| Athletics | Sprint Time ↓ | –0.90 s | <0.001 | [–1.23, –0.57] s | 0.94 (Large) |

| Case Study | Primary Model | Performance Metric 1 | Performance Metric 2 | Interpretability Tools |
| Football injury risk | LightGBM Classifier | AUC-ROC 0.87 | Injury rate ↓12%, Flagged ↓30% | SHAP Values |
| Basketball decision making | Logistic Regression | Optimal Choice ↑16% | Turnovers ↓22% | Model Coefficients |
| Athletics sprint mechanics | Multivariate Linear Regression | 100 m Sprint Time Reduction ↓8% | Joint Angles Optimized | Regression Coefficients |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).