Physics-Guided Machine Learning Surrogates for Bird Strike Analysis on Rotating Jet Engine Blades Through a Comparative Study of Lagrangian and SPH Simulations

MD Khalid Hasan Nabil; Jubayer Ahmed Sajid; Ivan Grgić; Jure Marijić; Saiaf Bin Rayhan

doi:10.20944/preprints202603.1182.v1

Submitted: 13 March 2026

Posted: 16 March 2026

Abstract

Bird strike events on rotating jet engine fan blades pose significant risks to aviation safety, yet high-fidelity numerical simulations remain computationally expensive, limiting their use in parametric design studies. This study develops a physics-informed machine learning surrogate framework for predicting bird strike response on rotating Ti-6Al-4V fan blades, systematically comparing Lagrangian (gelatin-based, Mooney–Rivlin) and Smoothed Particle Hydrodynamics (SPH, water-like) formulations. A total of 100 explicit dynamic simulations were conducted in ANSYS LS-DYNA (50 per formulation), varying bird impact velocity and blade angular speed. Random Forest, Support Vector Regression, Polynomial Regression, and XGBoost regression models were trained and evaluated using five-fold cross-validation. Results demonstrate that SPH-based surrogates achieve superior predictive accuracy, with Random Forest yielding R² = 0.9938 for maximum deformation and R² = 0.9962 for total energy dissipation. In contrast, Lagrangian-based stress surrogates exhibited severe performance degradation (R² = 0.24) due to mesh-dependent numerical noise. The trained surrogates achieved computational speed-up factors of 10⁴–10⁵ relative to direct simulation. These findings establish that surrogate model reliability is fundamentally governed by the numerical quality of the training data, providing guidance for integrating machine learning with impact simulation workflows in aero-engine blade design.

Keywords: bird strike; surrogate modeling; smoothed particle hydrodynamics; Lagrangian simulation; Random Forest; fan blade; machine learning; finite element analysis

Subject: Engineering - Aerospace Engineering

1. Introduction

Bird strike events remain a persistent and serious hazard to aviation safety, particularly during engine ingestion at takeoff and landing. Impacts on rotating fan and compressor blades produce large transient loads, severe plastic deformation, and cascading failure modes that can lead to engine shutdown or uncontained damage. Review studies report that bird strikes account for nearly 90% of all foreign object damage incidents in aviation [1]. Aviation regulators address this through strict certification requirements. FAR 25.571 specifies damage-tolerance criteria requiring aircraft structures to withstand impacts from a 1.8 kg bird at prescribed velocities without loss of safe flight [2]. Large turbofan engines face similar demands, with rotating fan blades required to survive high-energy soft-body impacts without catastrophic disintegration [3]. Full-scale physical tests are costly, difficult to standardize, and ethically challenging due to variability in bird mass, geometry, and composition [1,4]. Computational modeling has therefore become an indispensable tool for blade survivability assessment in both preliminary and detailed design stages [3,4].

Early numerical bird strike studies used classical Lagrangian finite element formulations, where both the bird and target were discretized as deformable meshes. Modeling the bird as a homogeneous soft projectile with water-like properties reproduced key impact metrics at lower velocities, but severe mesh distortion limited robustness at high impact speeds [5,6]. The Arbitrary Lagrangian-Eulerian (ALE) formulation improved numerical stability by partially decoupling material motion from mesh deformation, achieving agreement within approximately 7–10% of experimental data for representative plate impact cases [7,8]. The recognition that birds behave predominantly as hydrodynamic projectiles at high velocities led to the widespread adoption of Smoothed Particle Hydrodynamics (SPH), where the bird is represented as a collection of Lagrangian particles governed by an equation of state, which naturally avoids mesh entanglement [9]. Extensive validation confirmed that SPH reproduces pressure histories, force-time responses, and residual deformation patterns consistent with experimental measurements [10,11,12]. SPH has since become the industry-standard approach for simulating bird impacts on rotating engine components, while Lagrangian and ALE methods remain relevant for comparative studies and computationally efficient analyses [5,8,13]. A significant advancement in Lagrangian modeling is the use of ballistic gelatin as a validated substitute bird material. Aslam et al. [14] demonstrated that a Mooney-Rivlin hyperelastic model for ballistic gelatin reproduces experimental impact data while reducing computation time compared to SPH. Mesh distortion was controlled through node erosion without compromising global response characteristics [14], making gelatin-based Lagrangian models a practical choice for parametric dataset generation.

Despite advances in high-fidelity simulation, detailed bird strike analyses of rotating blades remain computationally intensive, with runtimes ranging from hours to days per case [15,16]. This limits their applicability for design optimization and large parametric studies. Machine learning surrogate models address this by learning mappings between impact parameters and structural response quantities from simulation data, achieving speed-up factors of three to five orders of magnitude over direct simulation [17,18]. Several studies have shown that ML models trained on FEA outputs predict global response quantities such as maximum displacement and absorbed energy with errors below 5–10% [19,20,21]. However, most prior surrogate work focuses on a single numerical formulation, typically SPH, without examining how the underlying simulation methodology influences surrogate fidelity. Localized response quantities such as von Mises stress are particularly challenging for data-driven models because they are sensitive to mesh topology and numerical stabilization choices [22]. Under small-dataset constraints imposed by expensive simulations, ensemble learning methods such as Random Forest regression have demonstrated robustness over deep learning architectures, which require substantially more training data [23,24,25].

This study addresses these gaps by developing a physics-informed machine learning surrogate framework for predicting bird strike response of rotating jet engine fan blades, using parallel Lagrangian (gelatin-based) and SPH (water-like) datasets generated in ANSYS LS-DYNA. The primary objectives are: (i) to generate high-fidelity simulation datasets with both numerical formulations under validated material models for Ti-6Al-4V blades; (ii) to train and evaluate ensemble-based ML surrogates for global and localized response quantities including maximum deformation, von Mises stress, and internal energy; and (iii) to assess how numerical formulation quality propagates into surrogate learnability and generalization. This work demonstrates that surrogate model reliability is fundamentally governed by the numerical quality of training data, providing practical guidance for the selection of simulation methods in ML-augmented impact analysis workflows.

2. Materials and Methods

2.1. Overall Numerical Framework and Study Design

Explicit dynamic bird strike simulations were conducted in ANSYS LS-DYNA R2 to generate high-fidelity datasets for surrogate modeling of rotating jet-engine blade response. Two numerical formulations were implemented in parallel: (i) a Lagrangian finite element approach using a gelatin-based substitute bird and (ii) a Smoothed Particle Hydrodynamics (SPH) approach using a water-like bird model. The choice of SPH as a robust method for soft-body impact (avoiding mesh tangling) and the continued use of Lagrangian formulations for computational efficiency are consistent with established bird strike simulation literature [26,27,28,29]. A total of 100 simulations were executed: 50 Lagrangian cases and 50 SPH cases. Each simulation produced quantities of interest (QoIs) used later for machine learning training, including maximum total deformation and peak von Mises stress.

2.2. Blade Model: Geometry, Boundary Conditions, and Rotation

A three-dimensional rotating fan blade model representative of aero-engine fan applications was used as the impact target. Rotation was applied by prescribing a constant angular velocity about the engine axis; the investigated operating range was selected to remain consistent with published rotating fan/blade bird strike practices [27,29]. Boundary conditions were applied to represent hub/root attachment constraints, while allowing realistic bending and torsional deformation in the blade span. The bird was assigned an initial translational velocity aligned toward the blade impact region. Contact was modeled using a general penalty-based contact algorithm; friction effects were neglected because hydrodynamic momentum transfer dominates the short-duration soft-body impact response [3,4].

The geometric configuration of the rotating fan blade and the discretized numerical model used for the bird strike simulations are illustrated in Figure 1. The bird projectile was modeled as a hemispherical-ended cylinder, a geometry commonly adopted in bird strike studies. To maintain a realistic mass distribution and aspect ratio for soft-body impact modeling, the bird was defined with a diameter of 134 mm and a length of 268 mm, giving a 1:2 diameter-to-length ratio, which is widely used in certification-style bird strike simulations.

For the numerical discretization, the blade structure was meshed using a characteristic element size of 15 mm, which was selected based on the mesh convergence study to balance computational accuracy and runtime for the large simulation dataset. The bird model was discretized using a 10 mm element size, allowing improved resolution of the soft-body deformation and momentum transfer during impact. This discretization strategy ensures stable contact interaction between the bird and blade while maintaining manageable computational cost for the explicit dynamic simulations.

2.3. Blade Material Model: Ti-6Al-4V with Johnson–Cook Strength and Failure

The blade material was Ti-6Al-4V and was modeled using a Johnson–Cook (JC) constitutive framework coupled with a JC damage/failure model, which is widely applied in fan-blade bird strike simulations to capture plasticity, strain-rate effects, thermal softening, and progressive failure [29]. The Johnson–Cook flow stress is defined as:

σ = (A + Bεⁿ)(1 + C ln(ε̇/ε̇₀))(1 − ((T − T₀)/(Tₘ − T₀))ᵐ) (1)

The Ti-6Al-4V parameters used in the simulations are listed in Table 1, Table 2 and Table 3.

This setup aligns with literature that emphasizes JC-based modeling for blade deformation and failure during bird ingestion scenarios [27].
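To make the structure of Equation (1) concrete, the three multiplicative terms (strain hardening, strain-rate sensitivity, and thermal softening) can be evaluated as in the minimal sketch below. The function name, the reference strain rate, the reference and melting temperatures, and all material constants are placeholders chosen for illustration; they are assumptions and are not the Table 1 values.

```python
import math

def johnson_cook_stress(eps_p, eps_rate, temp, A, B, n, C, m,
                        eps_rate_ref=1.0, temp_ref=293.0, temp_melt=1878.0):
    """Johnson-Cook flow stress following Eq. (1); stress units follow A and B."""
    hardening = A + B * eps_p**n                          # (A + B*eps^n)
    rate = 1.0 + C * math.log(eps_rate / eps_rate_ref)    # (1 + C ln(rate ratio))
    t_star = (temp - temp_ref) / (temp_melt - temp_ref)   # homologous temperature
    softening = 1.0 - max(t_star, 0.0)**m                 # (1 - T*^m), clamped below T0
    return hardening * rate * softening

# Placeholder constants in MPa, for illustration only (NOT the paper's Table 1).
params = dict(A=1000.0, B=1000.0, n=0.9, C=0.01, m=1.0)

# Reference state: no plastic strain, reference strain rate, reference temperature.
sigma_ref = johnson_cook_stress(0.0, 1.0, 293.0, **params)
# Impact-like state: 5% plastic strain at a strain rate of 1000 1/s.
sigma_fast = johnson_cook_stress(0.05, 1000.0, 293.0, **params)
print(sigma_ref, sigma_fast)
```

At the reference state the rate and thermal factors both equal one, so the flow stress reduces to A; raising the plastic strain or the strain rate increases it, while heating toward the melting temperature drives it to zero.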

2.4. Bird Geometry and Mass Consistency Across Formulations

The bird projectile was modeled using the widely adopted hemispherical-ended cylinder substitute geometry. This configuration has repeatedly been shown to be among the most reliable simplified bird models in terms of impact force/pressure representation across orientations and is widely used in numerical bird strike studies [26,28,29]. A constant bird mass was maintained across Lagrangian and SPH simulations so that differences in response could be attributed to formulation and constitutive representation rather than mass inconsistency.
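As a rough consistency check on the substitute geometry, the volume and mass of a hemispherical-ended cylinder can be estimated as sketched below. The sketch assumes that the 268 mm length is the overall tip-to-tip length (so the straight section is 134 mm long) and borrows the 950 kg·m⁻³ density quoted for the SPH bird; the function name and both assumptions are illustrative, and the result is a geometric estimate rather than the bird mass used in the simulations.

```python
import math

def capsule_volume(diameter_m, total_length_m):
    """Volume of a cylinder with hemispherical ends; the length includes both caps."""
    r = diameter_m / 2.0
    straight = total_length_m - diameter_m   # length of the cylindrical section
    if straight < 0:
        raise ValueError("total length must be at least one diameter")
    # Cylinder plus two hemispheres (one full sphere).
    return math.pi * r**2 * straight + (4.0 / 3.0) * math.pi * r**3

# Dimensions from the text; density is the SPH value, assumed here for illustration.
volume = capsule_volume(0.134, 0.268)
mass = 950.0 * volume
print(f"volume = {volume * 1e3:.3f} L, mass = {mass:.2f} kg")
```

Holding this mass fixed while changing only the bird formulation is what allows response differences to be attributed to the numerical method.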

2.5. Lagrangian Bird Model: Gelatin with Mooney–Rivlin Hyperelasticity

For the Lagrangian approach, the bird was modeled as ballistic gelatin using a two-parameter Mooney–Rivlin hyperelastic model, consistent with gelatin-based bird substitutes used in rotating blade simulations [27]. The strain energy density function is expressed as:

W = C₁₀(Ī₁ − 3) + C₀₁(Ī₂ − 3) + (1/D₁)(J − 1)² (2)

The material constants used in the Lagrangian bird model are listed in Table 4.

To control excessive mesh distortion typical of Lagrangian bird models at high velocity, erosion/deletion strategies were used where necessary to preserve simulation stability, an approach that has precedent in the literature for Lagrangian soft-body impact modeling [28].
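Equation (2) can be evaluated directly from the principal stretches, as in the minimal sketch below. The deviatoric invariants Ī₁ and Ī₂ are formed from the volume ratio J in the usual way; the function name and the numerical constants are hypothetical placeholders and are not the Table 4 values.

```python
def mooney_rivlin_energy(stretches, c10, c01, d1):
    """Strain energy density of Eq. (2) from three principal stretches."""
    l1, l2, l3 = stretches
    J = l1 * l2 * l3                                   # volume ratio
    i1 = l1**2 + l2**2 + l3**2                         # first invariant
    i2 = (l1 * l2)**2 + (l2 * l3)**2 + (l3 * l1)**2    # second invariant
    i1_bar = J**(-2.0 / 3.0) * i1                      # deviatoric first invariant
    i2_bar = J**(-4.0 / 3.0) * i2                      # deviatoric second invariant
    return c10 * (i1_bar - 3.0) + c01 * (i2_bar - 3.0) + (J - 1.0)**2 / d1

# Placeholder constants in Pa and 1/Pa, for illustration only (NOT Table 4).
consts = dict(c10=1.0e5, c01=5.0e4, d1=1.0e-9)

w_rest = mooney_rivlin_energy((1.0, 1.0, 1.0), **consts)
# Incompressible uniaxial compression to half the original length.
lam = 0.5
w_compressed = mooney_rivlin_energy((lam, lam**-0.5, lam**-0.5), **consts)
print(w_rest, w_compressed)
```

The energy vanishes in the undeformed state and grows with distortion; the last term penalizes volume change and is inactive in the incompressible example.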

2.6. SPH Bird Model: Water-Like Material with Hydrodynamic EOS

In the SPH formulation, the bird was modeled as a water-like material using a linear shock (Hugoniot) equation of state (EOS) with density 950 kg·m⁻³, Grüneisen coefficient 0.28, sound-speed parameter C₁ = 1483 m·s⁻¹, Hugoniot slope coefficient S₁ = 1.75, quadratic coefficient S₂ = 0, and a maximum tensile pressure of 0 Pa. The water-like assumption is widely used and supported in SPH bird strike studies because it reproduces fluid-like spreading and avoids mesh tangling [26,28,29]. The EOS governs compressibility and pressure response under impact, consistent with standard SPH bird modeling workflows described in the literature [29].
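For orientation, the linear shock relation implied by these parameters, U_s = C₁ + S₁·u_p with S₂ = 0, can be combined with the momentum jump condition P = ρ₀·U_s·u_p to estimate the initial shock (Hugoniot) pressure. The sketch below assumes a normal impact on a rigid, stationary target so that the particle velocity equals the bird velocity; this idealization ignores blade rotation and compliance and is not part of the reported simulations.

```python
RHO0 = 950.0   # kg/m^3, bird density from the text
C1 = 1483.0    # m/s, sound-speed parameter from the text
S1 = 1.75      # Hugoniot slope coefficient from the text

def shock_velocity(u_p):
    """Linear shock relation U_s = C1 + S1 * u_p (S2 = 0)."""
    return C1 + S1 * u_p

def hugoniot_pressure(u_p):
    """Shock pressure from the momentum jump condition P = rho0 * U_s * u_p."""
    return RHO0 * shock_velocity(u_p) * u_p

# Assumption: rigid-target normal impact, so u_p equals the bird velocity.
p_ref = hugoniot_pressure(122.5)
print(f"Hugoniot pressure at 122.5 m/s: {p_ref / 1e6:.1f} MPa")
```

The estimate is useful mainly as an order-of-magnitude check on peak contact pressures in the first microseconds of impact.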

2.7. Mesh Convergence Study

A mesh convergence study was conducted to select a blade element size that provides stable deformation predictions while maintaining feasible computational cost for the large parametric dataset. The bird model discretization was kept constant at an equivalent resolution of 10 mm for all cases to isolate the influence of the blade mesh refinement on the structural response. The convergence assessment was performed using the SPH formulation, and the quantity of interest was the maximum total deformation of the blade during the primary impact window.

The blade mesh was refined from 20 mm to 10 mm element size. The corresponding maximum deformation values are summarized in Table 5 and illustrated in Figure 2. Overall, the predicted deformation decreased sharply when refining from coarse meshes (20–18 mm) toward 15–12 mm, indicating that coarse discretizations overpredict deformation due to reduced stiffness representation and poorer resolution of impact-induced bending. Between 15 mm and 12 mm, the predicted deformation changed only marginally (0.317 m → 0.305 m, a difference of about 4%), suggesting that the deformation response is approaching mesh independence in this refinement range. The 10 mm mesh produced a higher deformation value (0.345 m); such non-monotonic behavior can occur in explicit impact simulations due to local contact sensitivity, SPH–structure coupling effects, and differences in stress-wave resolution and numerical stabilization at very fine discretizations.

Considering the small variation between 15 mm and 12 mm, and the substantial increase in computational cost associated with finer meshes, a 15 mm blade element size was selected for all subsequent simulations used for surrogate-model dataset generation. This choice provides a practical balance between numerical accuracy and runtime, enabling execution of the full simulation matrix within available resources.
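The convergence argument can be expressed as relative changes between successive refinement levels. The sketch below uses only the three deformation values quoted in the text, assuming that 0.317 m and 0.305 m correspond to the 15 mm and 12 mm meshes, respectively; the helper name is illustrative.

```python
# Maximum deformation (m) by blade element size (mm), as quoted in the text.
deformation_m = {15: 0.317, 12: 0.305, 10: 0.345}

def relative_change(coarse, fine):
    """Relative change of the fine-mesh result with respect to the coarse-mesh result."""
    return abs(fine - coarse) / abs(coarse)

change_15_to_12 = relative_change(deformation_m[15], deformation_m[12])
change_12_to_10 = relative_change(deformation_m[12], deformation_m[10])
print(f"15 -> 12 mm: {change_15_to_12:.1%}")
print(f"12 -> 10 mm: {change_12_to_10:.1%}")
```

The small change between 15 mm and 12 mm supports the selected element size, while the larger jump at 10 mm quantifies the non-monotonic behavior noted above.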

A dedicated time-step convergence study was not performed because the explicit solver uses a stability-limited time increment governed by the smallest characteristic element length and material wave speed, and performing systematic time-step refinement would multiply the computational expense of each impact case. Since the objective of the convergence study in this work was to ensure stability of the maximum deformation response for the SPH-based method under a fixed impact duration, mesh refinement was prioritized as the primary numerical verification step to support efficient dataset generation.
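The stability-limited increment referred to here can be estimated from the Courant condition, Δt ≈ s·L/c, where L is the smallest characteristic element length and c the material wave speed. The sketch below is an order-of-magnitude illustration only: the elastic modulus and density are typical handbook values for Ti-6Al-4V, the one-dimensional bar wave speed c = √(E/ρ) is a simplification, and the safety factor s = 0.9 is a commonly used choice; none of these are taken from the paper. The 15 mm element size and the 0.001 s termination time are the values reported in this study.

```python
import math

E_PA = 113.8e9        # Pa, assumed handbook elastic modulus of Ti-6Al-4V
RHO = 4430.0          # kg/m^3, assumed handbook density of Ti-6Al-4V
ELEMENT_M = 0.015     # m, blade element size used in this study
SAFETY = 0.9          # assumed time-step safety factor
T_END = 0.001         # s, termination time reported in this study

wave_speed = math.sqrt(E_PA / RHO)        # 1D elastic bar wave speed
dt = SAFETY * ELEMENT_M / wave_speed      # stability-limited time increment
n_steps = math.ceil(T_END / dt)           # increments needed to reach T_END

print(f"c = {wave_speed:.0f} m/s, dt = {dt:.2e} s, steps = {n_steps}")
```

Because the increment scales linearly with element size, halving the blade element size roughly doubles the number of increments in addition to increasing the element count.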

2.8. Dataset Generation and Reference Validation

For each formulation, 50 simulation cases were computed by varying bird velocity and blade angular speed within physically relevant ranges. The first extracted data point, at a bird velocity of 122.5 m·s⁻¹ and a blade angular speed of 395 rad·s⁻¹, was used for reference validation. The resulting structural responses are summarized in Table 6. These magnitudes are consistent with published rotating fan/blade bird strike outcomes, where severe impacts generate large transient deformations and stresses approaching or exceeding the gigapascal range depending on configuration, constraints, and failure modeling [13].

2.9. Machine Learning Surrogate Modeling Workflow

The objective of the machine learning (ML) framework is to construct surrogate models capable of approximating high-fidelity numerical simulation outputs with significantly reduced computational cost. Separate surrogate models were developed for datasets generated using the Lagrangian and SPH formulations in order to assess how numerical methodology influences surrogate learning behavior. The overall ML workflow consists of feature engineering, data preprocessing, model training, cross-validation, and performance evaluation. All models were trained exclusively on simulation-derived data, without incorporating experimental measurements, ensuring a consistent and controlled comparison between numerical formulations.

2.10. Feature Engineering and Physical Coupling of Input Parameters

For the Lagrangian-based dataset, the surrogate model inputs were constructed to reflect the coupled kinetic nature of the bird–blade impact system. In addition to the primary input parameters—bird impact velocity (Vᵇ) and blade angular velocity (ω)—interaction and nonlinear terms were introduced to capture higher-order effects observed in explicit dynamics simulations. The final input feature vector is expressed as:

X = {Vᵇ, ω, Vᵇ × ω, Vᵇ², ω²} (3)

This feature formulation enables the surrogate model to learn nonlinear energy transfer mechanisms arising from the combined translational and rotational motion of the system. Feature importance analysis confirms that these engineered variables represent joint physical contributions rather than independent effects. For the SPH-based dataset, the simulation outputs exhibited smoother response behavior, allowing the surrogate model to be trained directly using the physical input parameters without additional polynomial expansion. The input vector for SPH-based models consists of the blade angular velocity component and bird impact velocity component aligned with the impact direction.
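The Lagrangian feature vector of Equation (3) can be assembled from the two primary inputs as in the following minimal sketch; the function name is illustrative, and the example evaluates the reference operating point used in this study.

```python
def engineered_features(v_bird, omega):
    """Return [V_b, omega, V_b*omega, V_b^2, omega^2] as in Eq. (3)."""
    return [v_bird, omega, v_bird * omega, v_bird**2, omega**2]

# Reference operating point: 122.5 m/s bird velocity, 395 rad/s angular speed.
x_ref = engineered_features(122.5, 395.0)
print(x_ref)
```

For the SPH surrogates, only the first two entries of this vector would be used as inputs.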

2.11. Data Preprocessing and Collinearity Assessment

Prior to model training, all datasets were examined for multicollinearity to ensure numerical stability and interpretability of the surrogate models. Collinearity tests confirmed the expected correlation between physically coupled variables (e.g., velocity and energy-related features), which is intrinsic to the problem formulation rather than an artifact of data construction (Figure 3 and Figure 4). Output variables were converted to consistent engineering units to facilitate stable training and comparison across targets: von Mises stress values were expressed in MPa and total energy values in kJ. No aggressive dimensionality reduction techniques were applied, as preserving physical interpretability was prioritized over statistical compactness given the small dataset size.
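The kind of collinearity expected between a primary input and its engineered counterpart can be illustrated with a plain Pearson correlation, together with the unit conversions applied to the outputs. The velocity values below are hypothetical sample points and are not the simulation matrix; the helper names are likewise illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx)**2 for a in x))
    sy = math.sqrt(sum((b - my)**2 for b in y))
    return cov / (sx * sy)

def pa_to_mpa(stress_pa):
    """Convert stress from Pa to MPa."""
    return stress_pa / 1e6

def j_to_kj(energy_j):
    """Convert energy from J to kJ."""
    return energy_j / 1e3

# Hypothetical bird velocities (m/s); NOT the sampled design points.
v_bird = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]
v_squared = [v**2 for v in v_bird]
r_v_vsq = pearson(v_bird, v_squared)
print(f"corr(V_b, V_b^2) = {r_v_vsq:.4f}")
```

Over a narrow, strictly positive velocity range, V and V² are almost perfectly correlated, which is why such correlations are treated here as intrinsic to the formulation rather than as a data defect.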

2.12. Training-Testing Split and Cross-Validation Strategy

Due to the limited dataset size (n = 50 per numerical formulation), an 80/20 training-testing split (40 training and 10 test cases) was employed for Random Forest, SVR, and Polynomial Regression, while a 90/10 split (45 training and 5 test cases) was used for XGBoost to balance model learning capacity with unbiased performance evaluation. To further assess model robustness and mitigate the variance introduced by random sampling, five-fold cross-validation was applied consistently across all surrogate models. k-fold cross-validation is particularly important in small-sample regimes, where a single train-test split may yield misleading performance estimates. Cross-validation metrics are therefore reported alongside test-set performance throughout this study.
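The partition sizes implied by this strategy can be sketched with a small, library-free index generator. This is an illustrative stand-in for whichever splitting utility was actually used; the function names, the shuffling scheme, and the seed are assumptions.

```python
import random

def holdout_sizes(n_samples, test_fraction):
    """Return (n_train, n_test) for a single hold-out split."""
    n_test = round(n_samples * test_fraction)
    return n_samples - n_test, n_test

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for shuffled k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # k disjoint folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

print(holdout_sizes(50, 0.2))   # 80/20 split
print(holdout_sizes(50, 0.1))   # 90/10 split
splits = list(k_fold_indices(50, k=5))
print([(len(tr), len(va)) for tr, va in splits])
```

With 50 cases, each cross-validation fold holds out 10 cases, the same size as the 80/20 test set, so a single unlucky partition can shift R² noticeably.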

2.13. Surrogate Models Evaluated

Multiple regression models were evaluated to identify the most suitable surrogate architecture for predicting bird strike response quantities. The selected models represent a balance between model complexity, interpretability, and robustness under limited data availability. Hyperparameters were tuned iteratively: candidate configurations were adjusted systematically and compared using k-fold cross-validation, and the configuration offering the best balance between predictive accuracy and generalization was retained to limit overfitting to the training data.

2.13.1. Random Forest Regression

Random Forest (RF) regression was selected as the primary surrogate model due to its ensemble-based architecture and demonstrated robustness in small and noisy datasets. The RF model was implemented using a fixed-depth ensemble of decision trees to prevent overfitting while maintaining nonlinear learning capability. The optimal hyperparameters were selected based on cross-validation performance and kept consistent across SPH and Lagrangian datasets (Table 7).

Although Random Forest inherently supports multi-output regression, separate single-output regressors were trained for each response variable to maintain clarity in performance assessment.

2.13.2. Support Vector Regression with Radial Basis Function Kernel

Support Vector Regression (SVR) with a radial basis function (RBF) kernel was evaluated as a nonlinear kernel-based baseline. While SVR demonstrated reasonable performance for select SPH targets, it exhibited instability and sensitivity to noise in Lagrangian-based von Mises stress prediction, particularly under cross-validation (Table 8).

2.13.3. Polynomial Regression

Polynomial regression models were included to assess the adequacy of low-order parametric approximations. Despite their simplicity, these models showed limited generalization capability for highly nonlinear response quantities and were prone to overfitting under small-sample conditions. The input feature space was expanded using a polynomial transformation of degree d = 2.

2.13.4. Extreme Gradient Boosting (XGBoost)

Extreme Gradient Boosting (XGBoost) was evaluated due to its strong performance in structured regression tasks. However, the method exhibited inconsistent behavior across different response variables and was particularly sensitive to noisy stress data derived from Lagrangian simulations (Table 9).

2.14. Model Evaluation Metrics

Surrogate model performance was assessed using standard regression metrics to provide a comprehensive evaluation of accuracy and robustness. The coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) were computed for all response variables. All metrics are reported for both the held-out test set and five-fold cross-validation (Table 10 and Table 11). Discrepancies between test-set and cross-validation performance were used as indicators of overfitting and data noise sensitivity.
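The three metrics follow their standard definitions and can be computed as in the sketch below. The toy values are hypothetical and serve only to show that R² becomes negative when a model predicts worse than the mean of the observations, which is how the negative R² values reported in the Results should be read.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p)**2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((t - mean_y)**2 for t in y_true)              # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }

# Hypothetical toy data, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
good = regression_metrics(y_true, [1.1, 1.9, 3.2, 3.8])
bad = regression_metrics(y_true, [4.0, 3.0, 2.0, 1.0])   # anti-correlated predictions
print(good)
print(bad)
```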

3. Results

3.1. Validation of High-Fidelity Numerical Simulations

The high-fidelity simulations were validated against published rotating fan-blade bird strike studies, with particular reference to the numerical investigations of Shahimi et al. [27], Wu et al. [13], and the soft-body impact framework of Badshah et al. [26]. Validation focused on deformation magnitude, peak stress levels, and qualitative impact behavior at the reference operating point (bird velocity 122.5 m·s⁻¹; blade angular speed 395 rad·s⁻¹).

The predicted maximum deformations and peak von Mises stresses at the reference operating point are consistent with published rotating blade simulations in the 100–150 m·s⁻¹ range, exhibiting bending-dominated mode shapes with maximum deflection near the blade tip and stress amplification toward the root [13,26,27].

The spatial distributions of total deformation and von Mises stress obtained from the explicit dynamic simulations are shown in Figure 5 and Figure 6, respectively, for an impact velocity of 145 m·s⁻¹. The deformation field highlights significant bending near the blade tip and mid-span regions, where the impact energy is primarily transferred, with maximum deformation occurring toward the outer blade region due to reduced structural stiffness and increased lever-arm effects under rotational loading. Stress contours reveal localization near the impact zone and along the blade root, reflecting the combined effects of local impact loading and global bending stresses induced by blade rotation.

The SPH model reproduced characteristic hydrodynamic spreading behavior widely reported in SPH-based bird strike studies [9,26]. The Lagrangian gelatin model produced larger deformation due to constitutive differences between the hyperelastic gelatin and the water-like EOS material, but both approaches yielded physically comparable magnitudes. Exact numerical replication of published results is not expected given differences in blade geometry, mesh density, material parameters, and contact settings. All simulations were terminated at 0.001 s to capture the primary impact phase, during which peak deformation and peak stress occur.

3.2. Surrogate Model Performance for Global Response Quantities

Global response quantities — maximum total deformation and total energy dissipation — represent integral measures of the bird strike event and are less sensitive to localized numerical fluctuations. The predictive performance of all evaluated surrogate models for these quantities is summarized in Table 10 and Table 11.

3.2.1. Maximum Total Deformation

For the Lagrangian dataset, Random Forest regression achieved a test-set R² = 0.770 with a five-fold cross-validation R² = 0.644, and RMSE below 0.10 m for both test and cross-validation cases (Table 10). Polynomial regression produced comparable test-set performance (R² = 0.769) but reduced robustness under cross-validation, indicating sensitivity to sampling variability. SVR and XGBoost showed negative or highly inconsistent R² values for this target, indicating poor generalization under small-sample conditions.

For the SPH dataset, Random Forest achieved a test-set R² = 0.994 and a five-fold cross-validation R² = 0.989, with RMSE below 0.005 m (Table 11). This level of agreement between test-set and cross-validation performance confirms that the Random Forest surrogate generalizes reliably beyond the training partition. The performance gap between SPH and Lagrangian-based deformation surrogates reflects the smoother deformation response inherent to SPH simulations, which enables more stable surrogate training.

Predicted versus actual deformation values for all surrogate models are shown in Figure 7 and Figure 8 for Lagrangian and SPH datasets, respectively.

3.2.2. Total Energy Dissipation

Total energy dissipation was predicted with high accuracy across both formulations. For the Lagrangian dataset, Random Forest achieved a test-set R² = 0.988 and a five-fold cross-validation R² = 0.992, with RMSE values of 47.54 kJ and 40.49 kJ, respectively (Table 10). The consistency between test-set and cross-validation performance indicates that global energy metrics are less affected by the localized numerical artifacts that compromise stress prediction.

For the SPH dataset, Random Forest achieved a test-set R² = 0.996 and a five-fold cross-validation R² = 0.995 (Table 11). SVR performed reasonably well for SPH energy prediction (R² = 0.966 test, R² = 0.970 CV). Polynomial regression and XGBoost exhibited unstable test-set behavior despite strong cross-validation scores, which indicates sensitivity to data partitioning rather than genuine generalization capacity. Energy dissipation emerges as the most reliably predicted quantity across both formulations, attributable to its global integral nature that averages over system-wide behavior.

Predicted versus actual total energy values are shown in Figure 9 and Figure 10 for Lagrangian and SPH datasets, respectively.

3.3. Surrogate Model Performance for Von Mises Stress Prediction

Maximum von Mises stress is sensitive to numerical discretization, mesh quality, and localized deformation behavior, and represents a more challenging prediction target for data-driven surrogate models compared to global quantities.

For the SPH dataset, Random Forest regression achieved a test-set R² = 0.915 and a five-fold cross-validation R² = 0.820, with RMSE values of 18.37 MPa and 21.14 MPa, respectively (Table 11). Although the cross-validation score is lower than the test-set score, both remain high and the RMSE values are similar, indicating that the trained model generalizes beyond specific data partitions. SVR with an RBF kernel produced a competitive cross-validation R² = 0.978, though its test-set metrics were more variable. Polynomial regression and XGBoost showed unstable test-set behavior for this target.

For the Lagrangian dataset, surrogate models exhibited pronounced performance degradation. Random Forest achieved a test-set R² = 0.236, despite a five-fold cross-validation R² = 0.799 (Table 10). This large discrepancy is indicative of severe overfitting driven by high-frequency numerical noise introduced by mesh distortion, element deletion, and stress localization artifacts during soft-body impact [22]. SVR, polynomial regression, and XGBoost showed similar or worse instability, with negative test-set R² values observed in several cases. The degradation in Lagrangian stress surrogate performance is therefore not a consequence of insufficient model capacity, but reflects a fundamental limitation imposed by the numerical characteristics of the underlying simulation data.

Predicted versus actual von Mises stress values are shown in Figure 11 and Figure 12 for SPH and Lagrangian datasets, respectively.

3.4. Feature Importance Analysis

Feature importance analysis was conducted for all Random Forest models trained on both SPH and Lagrangian datasets to assess physical interpretability of the trained surrogates.

For SPH-based models, blade angular velocity (ω) emerged as the dominant predictor across all response quantities, with importance scores between 0.55 and 0.56, followed by bird impact velocity (Vᵇ). This ranking is physically consistent: the relative velocity between the rotating blade and the incoming bird projectile is governed primarily by the rotational component, and kinetic energy transfer at the blade surface scales accordingly.

For Lagrangian-based models, the engineered interaction term Vᵇ × ω and the quadratic angular velocity component ω² exhibited elevated importance for deformation and energy predictions, reflecting the nonlinear coupled contribution of translational and rotational motion to structural response. For von Mises stress prediction, feature importance rankings varied significantly across cross-validation folds, confirming that the model learns from numerical noise rather than from consistent physical trends.

3.5. Computational Speed-Up

Individual bird strike simulations required between 22 and 33 minutes of wall-clock time per case, depending on impact conditions and numerical formulation. Once trained, the Random Forest surrogate models produced predictions within fractions of a second on a standard desktop CPU. For a representative simulation runtime of 25–30 minutes and an ML inference time on the order of milliseconds, the resulting speed-up factor lies in the range of O(10⁴) to O(10⁵), consistent with speed-up factors reported for ML surrogates applied to high-fidelity impact simulations [17,18].
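The speed-up factor is simply the ratio of simulation wall-clock time to surrogate inference time, as the sketch below illustrates. The simulation times are the 25–30 minute range quoted above; the inference latencies are hypothetical values chosen to show which latencies reproduce the quoted range, and were not measured in this study.

```python
def speed_up(sim_seconds, inference_seconds):
    """Ratio of direct-simulation time to surrogate inference time."""
    return sim_seconds / inference_seconds

SIM_RANGE_S = (25 * 60, 30 * 60)   # 25-30 minutes in seconds

# Hypothetical per-prediction latencies in seconds (assumptions, not measurements).
for latency in (0.15, 0.015, 0.001):
    low = speed_up(SIM_RANGE_S[0], latency)
    high = speed_up(SIM_RANGE_S[1], latency)
    print(f"latency {latency * 1e3:6.1f} ms -> speed-up {low:.1e} to {high:.1e}")
```

Under these assumptions, a factor of 10⁴ to 10⁵ corresponds to an effective latency of roughly tens to hundreds of milliseconds per prediction, whereas a latency near one millisecond would imply a factor closer to 10⁶.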

The predictive accuracy of the trained surrogates was further evaluated using structured validation cases spanning edge-of-domain, interpolative, off-diagonal, and extrapolative conditions within the input parameter space. Results are summarized in Table 12, Table 13, and Table 14 for Lagrangian deformation, Lagrangian stress, and SPH-based predictions, respectively. For both formulations, deformation predictions showed good agreement with direct simulation results across most validation cases. Stress predictions showed larger deviations, particularly for the SPH extrapolation case, which is consistent with the broader performance pattern observed during full dataset evaluation.

4. Discussion

4.1. Surrogate Model Performance in Context of Prior Work

Random Forest regression achieved the strongest overall performance across both numerical datasets in this study, consistent with recent findings that tree-based ensemble methods outperform deep learning on small tabular datasets [23,24,25]. For SPH-derived data, test-set coefficients of determination exceeded R² = 0.99 for maximum deformation and total energy dissipation, placing the present results among the highest reported accuracies for impact surrogate models trained on fewer than 100 samples. Vurtur Badarinath et al. [17] reported comparable accuracy for beam-level FEA surrogates using similar ensemble approaches, and Pana et al. [23] demonstrated that Random Forest models trained on FEA lattice data generalize reliably within the trained parameter space. Garg et al. [30] further showed that Random Forest surrogates reliably transform structural response predictions across different theoretical frameworks for composite plates and shells, confirming the generalization capacity of tree-based ensemble methods for FEA-derived structural quantities. The present results extend these findings to the more demanding context of rotating blade soft-body impact, where input-output relationships are governed by coupled translational and rotational dynamics.

4.2. Divergence Between SPH and Lagrangian Surrogate Performance

The most significant finding of this study is the pronounced divergence in surrogate performance between SPH and Lagrangian datasets, particularly for von Mises stress prediction. While SPH-based Random Forest surrogates achieved R² = 0.915 for stress, Lagrangian-based models degraded to R² = 0.236 on the test set, despite a cross-validation R² of 0.799. This discrepancy cannot be attributed to model architecture or hyperparameter selection, since identical configurations were used for both datasets. The root cause lies in the numerical characteristics of the Lagrangian simulation data itself. Mesh distortion, element deletion, and hourglass control artifacts introduce high-frequency spatial and temporal fluctuations in the stress field that are not reproducible by data-driven models trained on the global maximum stress extracted per simulation. Liang et al. [22] observed a similar degradation mechanism in deep learning stress surrogates when input fields deviate from smoothness, and Siemann and Ritt [9] confirmed that SPH formulations yield inherently smoother particle-level stress distributions compared to Lagrangian meshes under large deformation [31]. The present study provides direct quantitative evidence of this effect in a rotating blade bird strike context, which to the authors’ knowledge has not been previously demonstrated.
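
For readers who wish to reproduce the two scores being compared, the sketch below shows one way to compute a held-out test R² and a five-fold cross-validation R² with scikit-learn. The Random Forest settings follow Table 7 and the 90/10 split follows Table 9. The data are synthetic, the response function is hypothetical, and running cross-validation on the full 50-sample set is an assumption, since the exact protocol is not restated here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold, cross_val_score, train_test_split

rng = np.random.default_rng(0)

# SYNTHETIC stand-in for one 50-run dataset (not the paper's simulation data).
n_runs = 50
v_bird = rng.uniform(122.5, 247.5, n_runs)   # bird velocity, m/s
omega = rng.uniform(395.0, 645.0, n_runs)    # blade speed, rad/s
X = np.column_stack([v_bird, omega])
# HYPOTHETICAL smooth response with a small amount of added noise.
y = 1e-6 * (v_bird**2 + (0.5 * omega) ** 2) + rng.normal(0.0, 0.01, n_runs)

# 90/10 split as listed in Table 9.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=42
)

# Random Forest hyperparameters from Table 7.
model = RandomForestRegressor(
    n_estimators=25,
    max_depth=5,
    min_samples_leaf=2,
    max_features="sqrt",
    random_state=42,
)
model.fit(X_train, y_train)
r2_test = r2_score(y_test, model.predict(X_test))

# Five-fold cross-validation; cross_val_score refits a clone of the model per fold.
cv = KFold(n_splits=5, shuffle=True, random_state=42)
r2_cv = cross_val_score(model, X, y, cv=cv, scoring="r2")

print(f"test R2 ({len(y_test)} samples): {r2_test:.3f}")
print(f"CV R2 mean over {len(r2_cv)} folds: {r2_cv.mean():.3f}")
```

With n = 50 and a 90/10 split, the held-out set contains five samples, so each test-set score rests on very few points.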

4.3. Global vs. Localized Response Quantities

Total energy dissipation was predicted with consistently high accuracy for both numerical formulations, with Random Forest achieving R² = 0.988 (Lagrangian) and R² = 0.996 (SPH) on the test set. This contrasts sharply with the stress prediction results and reflects the integral nature of energy metrics, which average over the entire system and are therefore less sensitive to localized numerical artifacts. Maximum total deformation occupies an intermediate position: SPH deformation prediction is excellent (R² = 0.994), while Lagrangian deformation accuracy is moderate (R² = 0.770). The spatial averaging inherent to global deformation reduces, but does not eliminate, the influence of mesh-dependent variability. Wu et al. [13] reported that peak deformation and kinetic energy metrics derived from SPH simulations of rotating fan blades exhibit smooth parametric trends across velocity and rotational speed, which is consistent with the learnability observed for the SPH dataset in this study. Guida et al. [11] further confirmed that SPH-based global response metrics agree with experimental results within 10%, reinforcing the reliability of SPH data as surrogate training input.

4.4. Feature Importance and Physical Interpretability

Feature importance analysis for SPH-based Random Forest models identified blade angular velocity (ω) as the dominant predictor across all response quantities, with importance scores between 0.55 and 0.56. This ranking is physically consistent with the governing dynamics of rotating blade bird strike: the rotational component determines the relative impact velocity between the bird and blade surface, and the resulting kinetic energy transfer scales with the square of this relative velocity. For Lagrangian-based models, the engineered interaction term Vᵇ × ω and the quadratic component ω² showed elevated importance for deformation and energy predictions, capturing the nonlinear coupling between translational and rotational motion. However, for Lagrangian von Mises stress prediction, feature importance rankings varied significantly across cross-validation folds, confirming that the model learns from numerical noise rather than from consistent physical trends. This instability in feature rankings is a diagnostic indicator of surrogate unreliability and should be considered a practical screening criterion when evaluating surrogate models for impact applications.
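
The screening criterion described above can be sketched as follows: refit the model on each cross-validation training fold and inspect how much the importance vector moves between folds. The feature set mirrors the inputs and engineered terms named in the text, and the hyperparameters follow Table 7. The data are synthetic and both target functions are hypothetical, so the printed spreads only illustrate the procedure.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

FEATURES = ["V_b", "omega", "V_b*omega", "omega^2"]


def fold_importances(X, y, n_splits=5, seed=42):
    """Return an (n_splits, n_features) array of per-fold feature importances."""
    rows = []
    splitter = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for train_idx, _ in splitter.split(X):
        model = RandomForestRegressor(
            n_estimators=25,
            max_depth=5,
            min_samples_leaf=2,
            max_features="sqrt",
            random_state=seed,
        )
        model.fit(X[train_idx], y[train_idx])
        rows.append(model.feature_importances_)
    return np.vstack(rows)


rng = np.random.default_rng(1)
v = rng.uniform(122.5, 247.5, 50)
w = rng.uniform(395.0, 645.0, 50)
X = np.column_stack([v, w, v * w, w**2])

y_smooth = 1e-3 * w + 1e-4 * v        # HYPOTHETICAL smooth target
y_noisy = rng.normal(0.0, 1.0, 50)    # HYPOTHETICAL target with no physical signal

imp_smooth = fold_importances(X, y_smooth)
imp_noisy = fold_importances(X, y_noisy)

# Standard deviation across folds: larger values mean less stable rankings.
for label, imp in (("smooth", imp_smooth), ("noisy", imp_noisy)):
    spread = dict(zip(FEATURES, imp.std(axis=0).round(3)))
    print(label, spread)
```

Because ω and ω² are monotonic transforms of each other, a tree can split on either with the same effect, so some spread between those two columns is expected even for a well-behaved target.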

4.5. Computational Speed-Up and Practical Applicability

Individual bird strike simulations in this study required between 22 and 33 minutes per case. Once trained, Random Forest surrogates produced predictions within fractions of a second, yielding speed-up factors in the range O(10⁴)–O(10⁵). This acceleration is consistent across interpolative, off-diagonal, and edge-of-domain validation cases, confirming that the trained surrogates generalize reliably within the sampled parameter space. Speed-up factors of this magnitude have been reported in related impact mechanics applications [17,18,32] and enable parametric studies that would require weeks of direct simulation to be completed in minutes. For rotating blade bird strike specifically, this capability is relevant to early-stage design exploration and preliminary certification screening, where rapid evaluation of hundreds of impact parameter combinations is needed before high-fidelity verification. The structured validation approach employed here — spanning edge-of-domain, interpolative, extrapolative, and off-diagonal cases — provides a more rigorous characterization of surrogate applicability than single held-out test splits, and is recommended as a standard evaluation protocol for impact surrogates.

4.6. Limitations and Future Directions

Several limitations of the present study should be acknowledged. The surrogate models were trained and validated exclusively within the sampled parameter space defined by bird velocity and blade angular speed. Extrapolation beyond the training bounds, as demonstrated for Case D, produced larger prediction errors particularly for von Mises stress, which is consistent with the known limitations of ensemble regression models in extrapolative regimes. The present study also considers only two input parameters; operational bird strike scenarios involve additional variables including bird orientation, impact location along the blade span, and material variability, all of which would increase the required dataset size and complicate surrogate training. The dataset size of n = 50 per formulation was constrained by available computational resources and is at the lower boundary for reliable ensemble model training. Physics-Informed Neural Networks (PINNs) offer a promising pathway for improving localized stress prediction by embedding constitutive constraints directly into the learning process [33], potentially reducing the sensitivity to numerical noise observed in Lagrangian-based surrogates. Graph Neural Networks (GNNs) represent another avenue for capturing spatially distributed structural response while respecting mesh topology. Future work should also address more diverse input sampling strategies, such as Latin hypercube or quasi-random designs, to improve coverage of the parameter space and reduce surrogate variance at the training boundaries. Recent work by Wu et al. [29] on flocking bird strikes demonstrates the complexity of multi-bird ingestion scenarios, which represent a further extension of the present surrogate framework.
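
As a pointer for the sampling strategy mentioned above, the following is a minimal, hand-rolled Latin hypercube sketch using NumPy. The bounds are the minimum and maximum bird velocity and blade speed listed for the edge cases in Table 12; the function names are illustrative, and scipy.stats.qmc.LatinHypercube offers a library implementation of the same idea.

```python
import numpy as np


def unit_latin_hypercube(n_samples, n_dims, rng):
    """Latin hypercube on [0, 1)^d: exactly one point per stratum in every dimension."""
    unit = np.empty((n_samples, n_dims))
    for j in range(n_dims):
        strata = rng.permutation(n_samples)      # shuffled stratum indices
        jitter = rng.random(n_samples)           # position inside each stratum
        unit[:, j] = (strata + jitter) / n_samples
    return unit


def scale_to_bounds(unit, bounds):
    """Map unit-cube samples to the physical ranges given as (low, high) pairs."""
    low = np.array([b[0] for b in bounds])
    high = np.array([b[1] for b in bounds])
    return low + unit * (high - low)


# Bird velocity (m/s) and blade angular speed (rad/s) ranges from Table 12.
BOUNDS = [(122.5, 247.5), (395.0, 645.0)]

unit = unit_latin_hypercube(50, len(BOUNDS), np.random.default_rng(0))
design = scale_to_bounds(unit, BOUNDS)

print(design[:5].round(1))
```

Each of the 50 equal-width bins along either axis receives exactly one design point, which is the property that distinguishes this design from plain random sampling.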

5. Conclusions

This study developed a physics-informed machine learning surrogate framework to predict the structural response of rotating Ti-6Al-4V fan blades subjected to bird strike loading, using parallel simulation datasets generated via Lagrangian gelatin-based and SPH water-like explicit dynamic formulations in ANSYS LS-DYNA. A total of 100 high-fidelity simulations were conducted across both formulations, with four surrogate architectures trained and evaluated using five-fold cross-validation.

The central finding is that surrogate model reliability is governed primarily by the numerical consistency of the training data, rather than by model architecture or hyperparameter tuning. SPH-based Random Forest surrogates achieved near-perfect predictive accuracy for global response quantities and maintained strong performance for localised stress prediction. In contrast, Lagrangian-based stress surrogates showed severe degradation on the test set despite acceptable cross-validation scores. This performance loss is directly attributable to high-frequency numerical noise inherent in Lagrangian soft-body impact simulations, arising from mesh distortion, element deletion, and stress localisation artefacts, rather than any deficiency in the surrogate architecture itself.

Among the evaluated models, Random Forest regression demonstrated the most consistent generalisation across both numerical formulations and all response quantities, confirming the robustness of tree-based ensemble methods under small-dataset constraints. The response quantity hierarchy with respect to surrogate learnability follows a physically interpretable pattern: total energy dissipation, as an integral system-level quantity, is the most reliably predicted; maximum deformation occupies an intermediate position; and von Mises stress remains the most challenging target due to its sensitivity to local numerical discretisation.

Feature importance analysis reinforced the physical consistency of SPH-based surrogates, with blade angular velocity identified as the dominant predictor, consistent with the governing dynamics of rotating blade impact. Lagrangian-based stress models exhibited unstable feature importance rankings across cross-validation folds, serving as a practical diagnostic indicator that surrogate variance is driven by numerical noise rather than recoverable physical trends.

The trained surrogates achieved computational speed-up factors in the range of 10⁴ to 10⁵ relative to direct simulation, enabling parametric exploration across hundreds of impact combinations within minutes. This capability has direct implications for early-stage aero-engine blade design and preliminary certification screening workflows.

These results establish that the selection of numerical simulation methodology is a critical upstream decision in any ML-augmented impact analysis pipeline. SPH formulations produce smooth, physically consistent response fields that yield reliable surrogate training data, while Lagrangian methods introduce numerical artefacts that fundamentally constrain surrogate learnability for localised response quantities. Future work should extend the framework to additional input parameters, including bird orientation and spanwise impact location, investigate physics-informed neural network architectures for improved stress prediction, and employ space-filling Latin hypercube sampling strategies to improve generalisation under extrapolation conditions.

Supplementary Materials

The following supporting information can be downloaded at the website of this paper posted on Preprints.org. The interactive surrogate model dashboard supporting this study is available at https://birdstrike-dashboard.vercel.app. The source code of the application is archived on Zenodo at https://doi.org/10.5281/zenodo.18937498.

Author Contributions

Conceptualization, K.H.N. and J.A.S.; methodology, K.H.N. and J.A.S.; software, K.H.N. and J.A.S.; validation, K.H.N. and J.A.S.; formal analysis, K.H.N. and J.A.S.; investigation, K.H.N. and J.A.S.; resources, K.H.N. and J.A.S.; data curation, K.H.N. and J.A.S.; writing – original draft preparation, K.H.N. and J.A.S.; writing – review and editing, I.G., J.M. and S.B.R.; visualization, I.G.; supervision, I.G. and S.B.R.; funding acquisition, I.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the University of Slavonski Brod through the institutional research project Analysis of the influence of design and process parameters of FDM technology on the mechanical and vibrational properties of polyamide PA6 tooth of a cylindrical spur gear for the purpose of optimizing the hybrid infill structure design (ASPFDM-PA6), financed by the European Union – NextGenerationEU. The views and opinions expressed in this paper are those of the authors and do not necessarily reflect the official position of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them.

Data Availability Statement

The simulation datasets supporting the results of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors reviewed and edited all AI-generated content and take full responsibility for the final version of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Boyacı, E.; Altın, M. Experimental and Numerical Approach on Bird Strike: A Review. Int. J. Automot. Sci. Technol. 2023, 7, 95–103.
2. Walvekar, V.; Thorbole, C.K.; Bhonge, P.; Lankarani, H.M. Birdstrike Analysis on Leading Edge of an Aircraft Wing Using a Smooth Particle Hydrodynamics Bird Model. In Proceedings of the ASME 2010 International Mechanical Engineering Congress and Exposition (IMECE2010), Vancouver, BC, Canada, 12–18 November 2010; Paper No. IMECE2010-37667, pp. 77–87.
3. Niering, E. Simulation of Bird Strikes on Turbine Engines. J. Eng. Gas Turbines Power 1990, 112, 573–578.
4. Holmes, I.; Whisler, D. Simulating Bird Strikes Using Smoothed Particle Hydrodynamics for Improved Aircraft Safety. In Proceedings of the SPIE 11380, Nondestructive Characterization and Monitoring of Advanced Materials, Aerospace, Civil Infrastructure, and Transportation XIV, Online, 27 April–8 May 2020; p. 113800M.
5. Ryabov, A.; Romanov, V.; Kukanov, S.; Shmotin, Y.; Chupin, P. Fan Blade Bird Strike Analysis Using Lagrangian, SPH and ALE Approaches. In Proceedings of the 6th European LS-DYNA Users Conference, Gothenburg, Sweden, 29–30 May 2007.
6. Riccio, A.; Cristiano, R.; Saputo, S. A Brief Introduction to the Bird Strike Numerical Simulation. Am. J. Eng. Appl. Sci. 2016, 9, 946–950.
7. Huertas-Ortecho, C.A. Robust Bird-Strike Modeling Using LS-DYNA. Master’s Thesis, University of Puerto Rico, Mayagüez, Puerto Rico, 2006.
8. Goyal, V.K.; Huertas, C.A.; Vasko, T.J. Arbitrary Lagrange Eulerian Approach for Bird-Strike Analysis Using LS-DYNA. Am. Trans. Eng. Appl. Sci. 2013, 2, 109–132.
9. Siemann, M.H.; Ritt, S.A. Novel Particle Distributions for SPH Bird-Strike Simulations. Comput. Methods Appl. Mech. Eng. 2019, 343, 746–766.
10. Audic, S.; Berthillier, M.; Bonini, J.; Bung, H.; Combescure, A. Prediction of Bird Impact in Hollow Fan Blades. In Proceedings of the 36th AIAA/ASME/SAE/ASEE Joint Propulsion Conference and Exhibit, Huntsville, AL, USA, 17–19 July 2000; Paper No. AIAA 2000-3201.
11. Guida, M.; Marulo, F.; Belkhelfa, F.Z.; Russo, P. A Review of the Bird Impact Process and Validation of the SPH Impact Model for Aircraft Structures. Prog. Aerosp. Sci. 2022, 129, 100787.
12. McCarthy, M.A.; Xiao, J.R.; McCarthy, C.T.; Kamoulakos, A.; Ramos, J.; Gallard, J.P.; Melito, V. Modeling of Bird Impacts on an Aircraft Wing—Part II: Modeling the Impact with an SPH Bird Model. Int. J. Crashworthiness 2005, 10, 51–59.
13. Wu, B.; Hedayati, R.; Li, Z.; Zhang, J.; Zhong, Z. Dynamic Responses of the Aero-Engine Rotor System to Bird Strike on Fan Blades at Different Rotational Speeds. Appl. Sci. 2021, 11, 8883.
14. Aslam, M.A.; Rayhan, S.B.; Mohd Zain, M.Z.; Alias, A.; Ramli, A.S.; Ahmad, F. Ballistic Gelatin Lagrange Mooney-Rivlin Material Model as a Substitute of Bird in Finite Element Bird Strike Case Studies. Lat. Am. J. Solids Struct. 2020, 17, e298.
15. Abdullah, N.A.; Yusoff, M.D.; Shahimi, S.S.; Meor Ahmad, M.I. Numerical Modelling of Bird Strike on Aerospace Structures by Means of Coupling FE-SPH. Int. J. Integr. Eng. 2021, 13, 185–193.
16. Heimbs, S. Computational methods for bird strike simulations: A review. Computers & Structures 2011, 89, 2093–2112.
17. Vurtur Badarinath, P.; Chierichetti, M.; Davoudi Kakhki, F. A Machine Learning Approach as a Surrogate for a Finite Element Analysis: Status of Research and Application to One Dimensional Systems. Sensors 2021, 21, 1654.
18. Martinez-Gonzalez, D.A.; Jude, D.; Wissink, A.M. ROAM-ML: A Reduced Order Aerodynamic Module Augmented with Neural Network Digital Surrogates. In Proceedings of the AIAA SCITECH 2022 Forum, San Diego, CA, USA, 3–7 January 2022; p. 1248.
19. Ryan, S.; Thaler, S.; Kandanaarachchi, S. Machine Learning Methods for Predicting the Outcome of Hypervelocity Impact Events. Expert Syst. Appl. 2016, 45, 23–39.
20. Santos, L. Deep and Physics-Informed Neural Networks as a Substitute for Finite Element Analysis. In Proceedings of the 2024 9th International Conference on Machine Learning Technologies, Oslo, Norway, 24–26 May 2024; pp. 84–90.
21. Peters, N.; Wissink, A.; Ekaterinaris, J. Comparison of Data-Driven Approaches to Rotorcraft Store Separation Modeling. J. Aircr. 2024, 61.
22. Liang, L.; Liu, M.; Martin, C.; Sun, W. A deep learning approach to estimate stress distribution: a fast and accurate surrogate of finite-element analysis. J. R. Soc. Interface 2018, 15, 20170844.
23. Pana, S.; Duy, V.; Thongchai, F.; Ramnarong, W.; Yuttana, M.; Tossapon, K.; Nakorn, T.; Itthidet, T. The role of machine learning for insight into the material behavior of lattices: A surrogate model based on data from finite element simulation. Results in Engineering 2024, 23, 102547.
24. Grinsztajn, L.; Oyallon, E.; Varoquaux, G. Why Do Tree-Based Models Still Outperform Deep Learning on Typical Tabular Data? In Advances in Neural Information Processing Systems 35 (NeurIPS 2022); Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022; pp. 507–520.
25. Shwartz-Ziv, R.; Armon, A. Tabular Data: Deep Learning Is Not All You Need. Inf. Fusion 2022, 81, 84–90.
26. Badshah, S.; Naeem, A.; Farhan Rafique, A.; Ul Haq, I.; Abdullah Malik, S. Numerical study on the critical frequency response of jet engine rotors for blade-off conditions against bird strike. Appl. Sci. 2019, 9, 5568.
27. Shahimi, S.S.; Abdullah, N.A.; Topa, A.; Hrairi, M.; Ismail, A.F. Numerical modelling of bird strike on a rotating engine blades based on variations of porosity density. IIUM Eng. J. 2022, 23.
28. Hedayati, R.; Ziaei-Rad, S. A new bird model and the effect of bird geometry in impacts from various orientations. Aerosp. Sci. Technol. 2013, 28, 9–20.
29. Wu, B.; Lin, J.; Xie, A.; Wang, N.; Zhang, G.; Zhang, J.; Dai, H. Flocking bird strikes on engine fan blades and their effect on the rotor system: a numerical simulation. Aerospace 2022, 9, 90.
30. Garg, A.; Mukhopadhyay, T.; Belarbi, M.O.; Li, L. Random forest-based surrogates for transforming the behavioral predictions of laminated composite plates and shells from FSDT to Elasticity solutions. Composite Structures 2023, 309, 116756.
31. Feng, R.; Fourtakas, G.; Rogers, B.D.; Lombardi, D. Large Deformation Analysis of Granular Materials with Stabilized and Noise-Free Stress Treatment in Smoothed Particle Hydrodynamics (SPH). Comput. Geotech. 2021, 138, 104356.
32. Huang, L.; Li, G.; Guan, Y.; Jiao, W.; Gong, S.; Xu, S. Machine learning and finite element integration-driven surrogate model for fluid-structure interaction seismic response analysis of aqueduct structures. Results Eng. 2025, 27, 106176.
33. Bolandi, H.; Sreekumar, G.; Li, X.; Lajnef, N.; Boddeti, V.N. Physics Informed neural network for dynamic stress prediction. Applied Intelligence 2023, 53, 26313–26328.

Figure 1. (A) CAD model of the rotating fan blade geometry and (B) discretized finite element model used for bird strike simulations.

Figure 2. Blade mesh convergence study: maximum total deformation as a function of blade element size. The 15 mm mesh was selected for all simulations.

Figure 3. Pearson correlation matrix for the Lagrangian-based simulation dataset features.

Figure 4. Pearson correlation matrix for the SPH-based simulation dataset features.

Figure 5. Total deformation distribution of the rotating fan blade following bird impact at an impact velocity of 145 m·s⁻¹.

Figure 6. Von Mises stress distribution in the rotating fan blade during bird strike at an impact velocity of 145 m·s⁻¹.

Figure 7. Predicted vs. actual maximum total deformation for all surrogate models — Lagrangian dataset.

Figure 8. Predicted vs. actual maximum total deformation for all surrogate models — SPH dataset.

Figure 9. Predicted vs. actual total energy dissipation for all surrogate models — Lagrangian dataset.

Figure 10. Predicted vs. actual total energy dissipation for all surrogate models — SPH dataset.

Figure 11. Predicted vs. actual von Mises stress for all surrogate models — SPH dataset.

Figure 12. Predicted vs. actual von Mises stress for all surrogate models — Lagrangian dataset.

Table 1. Elastic and thermal properties of Ti-6Al-4V.

Property	Value
Density	4420 kg·m⁻³
Young’s modulus	9.6 × 10¹⁰ Pa
Poisson’s ratio	0.36
Specific heat (Cp)	612 J·kg⁻¹·K⁻¹

Table 2. Johnson–Cook strength model parameters for Ti-6Al-4V.

Parameter	Value
A (Yield stress)	1098 MPa
B (Hardening constant)	1092 MPa
n (Hardening exponent)	0.93
C (Strain rate constant)	0.014
m (Thermal softening exponent)	1.1
Tₘ (Melt temperature)	1878 K
ε̇₀ (Reference strain rate)	1 s⁻¹
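
The parameters in Table 2 enter the standard Johnson–Cook flow-stress expression, σ = (A + B εⁿ)(1 + C ln(ε̇/ε̇₀))(1 − T*ᵐ), with T* the homologous temperature. A minimal sketch of evaluating it is given below. The reference temperature of 293 K is an assumption, since Table 2 does not list it, and the function name is illustrative.

```python
import math

# Johnson-Cook strength parameters from Table 2 (stresses in MPa).
A_MPA = 1098.0
B_MPA = 1092.0
N_EXP = 0.93
C_RATE = 0.014
M_EXP = 1.1
T_MELT_K = 1878.0
RATE_REF = 1.0      # reference strain rate, 1/s
T_REF_K = 293.0     # ASSUMPTION: room-temperature reference, not given in Table 2


def johnson_cook_stress(plastic_strain, strain_rate, temperature_k):
    """Flow stress in MPa from the standard Johnson-Cook strength model."""
    hardening = A_MPA + B_MPA * plastic_strain**N_EXP
    rate_term = 1.0 + C_RATE * math.log(strain_rate / RATE_REF)
    # Homologous temperature, clipped to [0, 1] so the power below stays real.
    t_star = (temperature_k - T_REF_K) / (T_MELT_K - T_REF_K)
    t_star = min(max(t_star, 0.0), 1.0)
    softening = 1.0 - t_star**M_EXP
    return hardening * rate_term * softening


yield_at_reference = johnson_cook_stress(0.0, 1.0, T_REF_K)
stress_high_rate = johnson_cook_stress(0.1, 1000.0, T_REF_K)
print(f"{yield_at_reference:.1f} MPa, {stress_high_rate:.1f} MPa")
```

At zero plastic strain, the reference strain rate, and the reference temperature, the expression reduces to the yield stress A.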

Table 3. Johnson–Cook damage/failure model parameters for Ti-6Al-4V.

Parameter	Value
D₁ (damage strain coefficient)	0.112
D₂ (damage strain coefficient)	0.123
D₃ (damage strain coefficient)	0.48
D₄ (strain rate coefficient)	0.014
D₅ (temperature coefficient)	3.87
Tₘ (melt temperature)	1878 K

Table 4. Material properties and Mooney–Rivlin hyperelastic constants for the bird model.

Property/Parameter	Value
Density	968 kg·m⁻³
Mooney–Rivlin Constant (C₁₀)	2.18 × 10⁵ Pa
Mooney–Rivlin Constant (C₀₁)	8.05 × 10⁴ Pa
Compressibility Parameter (D₁)	1.45 × 10⁻⁸ Pa⁻¹
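
For orientation, the constants in Table 4 can be converted to small-strain moduli. The sketch below assumes the ANSYS convention for the two-parameter Mooney–Rivlin model, in which the initial shear modulus is 2(C₁₀ + C₀₁) and the initial bulk modulus is 2/D₁; if a different convention was used for the volumetric term, the bulk modulus changes accordingly.

```python
# Mooney-Rivlin constants from Table 4.
C10_PA = 2.18e5
C01_PA = 8.05e4
D1_PER_PA = 1.45e-8

# ASSUMPTION: ANSYS convention, W = C10(I1-3) + C01(I2-3) + (1/D1)(J-1)^2.
shear_modulus_pa = 2.0 * (C10_PA + C01_PA)
bulk_modulus_pa = 2.0 / D1_PER_PA

# Equivalent small-strain Poisson's ratio from the shear and bulk moduli.
poisson_ratio = (3.0 * bulk_modulus_pa - 2.0 * shear_modulus_pa) / (
    2.0 * (3.0 * bulk_modulus_pa + shear_modulus_pa)
)

print(f"initial shear modulus: {shear_modulus_pa:.3e} Pa")
print(f"initial bulk modulus:  {bulk_modulus_pa:.3e} Pa")
print(f"Poisson's ratio:       {poisson_ratio:.4f}")
```

The resulting Poisson's ratio is close to 0.5, as expected for a nearly incompressible soft-body substitute.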

Table 5. Comparison of blade mesh densities evaluated for mesh convergence.

Mesh Size (mm)	Max. Total Deformation (m)	Notes
20	0.718	Coarse
18	0.541
15	0.317	Selected
12	0.305
10	0.345	Non-monotonic
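
The convergence behaviour in Table 5 can be read more directly as the relative change in maximum deformation between successive mesh sizes. The sketch below performs that arithmetic on the tabulated values; the function name is illustrative.

```python
# (element size in mm, maximum total deformation in m) from Table 5.
MESH_STUDY = [(20, 0.718), (18, 0.541), (15, 0.317), (12, 0.305), (10, 0.345)]


def successive_changes(rows):
    """Percentage change in deformation between each mesh and the next finer one."""
    changes = []
    for (h_prev, d_prev), (h_next, d_next) in zip(rows, rows[1:]):
        changes.append((h_prev, h_next, 100.0 * (d_next - d_prev) / d_prev))
    return changes


changes = successive_changes(MESH_STUDY)
for h_prev, h_next, pct in changes:
    print(f"{h_prev} mm -> {h_next} mm: {pct:+.1f}%")
```

The smallest change occurs between the 15 mm and 12 mm meshes, while the 10 mm mesh reverses the trend, which matches the "Non-monotonic" note in the table.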

Table 6. Validation of simulation responses at the reference operating point (Vᵇ = 122.5 m·s⁻¹, ω = 395 rad·s⁻¹).

Formulation (Bird Material)	Max. Deformation (m)	Von Mises Stress (Pa)
Lagrangian (Gelatin)	0.568	1.11 × 10⁹
SPH (Water-like)	0.317	9.65 × 10⁸

Table 7. Hyperparameters for Random Forest (RF) models applied to Lagrangian and SPH datasets.

Parameter	Value
Number of trees	25
Maximum tree depth	5
Minimum samples per leaf	2
Feature selection	Square-root criterion
Random state	42

Table 8. Hyperparameters for Support Vector Regression (SVR) models.

Parameter	Lagrangian	SPH
Kernel	Radial Basis Function (RBF)	Radial Basis Function (RBF)
Kernel coefficient (γ)	0.1	0.1
Regularization parameter (C)	160	160
Margin of tolerance (ε)	0.1	0.1

Table 9. Hyperparameters for XGBoost models applied to Lagrangian and SPH datasets.

Parameter	Lagrangian	SPH
Loss function	Squared error	Squared error
Boosting rounds	25	25
Shrinkage (learning rate)	0.1	0.1
Max. tree depth	5	5
Random state	42	42
Train/test split	90/10	90/10
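
The settings in Tables 7 to 9 map onto library arguments roughly as sketched below. This assumes scikit-learn was used for the Random Forest and SVR models, which is not stated in the tables; the factory function names are illustrative. The XGBoost settings are given as a plain dictionary of keyword arguments for xgboost.XGBRegressor so that the block runs without that package installed.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR


def make_random_forest():
    """Random Forest configured as in Table 7."""
    return RandomForestRegressor(
        n_estimators=25,        # number of trees
        max_depth=5,            # maximum tree depth
        min_samples_leaf=2,     # minimum samples per leaf
        max_features="sqrt",    # square-root feature selection
        random_state=42,
    )


def make_svr():
    """Support Vector Regression configured as in Table 8."""
    return SVR(kernel="rbf", gamma=0.1, C=160.0, epsilon=0.1)


# Table 9 settings expressed as keyword arguments for xgboost.XGBRegressor.
XGB_PARAMS = {
    "objective": "reg:squarederror",
    "n_estimators": 25,         # boosting rounds
    "learning_rate": 0.1,       # shrinkage
    "max_depth": 5,
    "random_state": 42,
}

rf_model = make_random_forest()
svr_model = make_svr()
print(rf_model.get_params()["n_estimators"], svr_model.get_params()["C"])
```

In SVR, ε is expressed in the units of the target, so residuals smaller than 0.1 in those units carry no penalty during training.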

Table 10. Regression performance metrics for Lagrangian dataset.

Model	Target	R² (test)	R² (CV 5f)	RMSE (test)	RMSE (CV 5f)
RF	Max. Deformation (m)	0.770	0.644	0.0749	0.0953
RF	Von Mises Stress (MPa)	0.236	0.799	30.88	31.28
RF	Total Energy (kJ)	0.988	0.992	47.54	40.49
SVR	Max. Deformation (m)	−0.371	−0.371	0.183	0.113
SVR	Von Mises Stress (MPa)	−0.515	0.754	43.48	29.34
SVR	Total Energy (kJ)	0.682	0.937	245.8	n/a ¹
Poly. Reg.	Max. Deformation (m)	0.769	0.611	0.0872	0.1019
Poly. Reg.	Von Mises Stress (MPa)	−0.770	0.704	35.43	31.67
Poly. Reg.	Total Energy (kJ)	0.509	1.000	390.8	0.897
XGBoost	Max. Deformation (m)	0.820	0.362	0.0770	0.1038
XGBoost	Von Mises Stress (MPa)	−2.149	0.773	40.36	45.15
XGBoost	Total Energy (kJ)	0.423	0.967	423.4	85.58

¹ Cross-validation for SVR Total Energy (Lagrangian) did not converge; value omitted.
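
Several entries in the table are negative, which can look surprising for a quantity named R². The short sketch below uses the standard definitions of R² and RMSE on made-up numbers to show how this arises: a model that predicts worse than the mean of the observed values has a residual sum of squares larger than the total sum of squares.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error in the units of the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# HYPOTHETICAL values for illustration only (not simulation data).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = list(y_true)
mean_only = [3.0] * 5     # always predicts the mean of y_true
biased = [6.0] * 5        # constant prediction far from the data

r2_perfect = r2_score(y_true, perfect)
r2_mean = r2_score(y_true, mean_only)
r2_biased = r2_score(y_true, biased)
print(r2_perfect, r2_mean, r2_biased, round(rmse(y_true, biased), 3))
```

A negative test-set R² therefore indicates that the surrogate performs worse than a constant prediction at the test-set mean.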

Table 11. Regression performance metrics for SPH dataset.

Model	Target	R² (test)	R² (CV 5f)	RMSE (test)	RMSE (CV 5f)
RF	Max. Deformation (m)	0.994	0.989	0.0041	0.0048
RF	Von Mises Stress (MPa)	0.915	0.820	18.37	21.14
RF	Total Energy (kJ)	0.996	0.995	29.18	36.77
SVR	Max. Deformation (m)	−0.011	−0.005	0.0527	0.0587
SVR	Von Mises Stress (MPa)	0.964	0.978	11.89	4.72
SVR	Total Energy (kJ)	0.966	0.970	88.11	99.73
Poly. Reg.	Max. Deformation (m)	0.265	0.995	0.0390	0.0033
Poly. Reg.	Von Mises Stress (MPa)	−0.645	1.000 ²	53.45	≈0
Poly. Reg.	Total Energy (kJ)	0.499	1.000 ²	311.9	≈0
XGBoost	Max. Deformation (m)	0.256	0.980	0.0392	0.0065
XGBoost	Von Mises Stress (MPa)	−1.492	0.966	65.79	11.21
XGBoost	Total Energy (kJ)	0.463	0.995	323.1	48.65

² Polynomial Regression cross-validation R² = 1.000 with RMSE ≈ 0 for SPH stress and energy targets reflects overfitting of the polynomial basis to the small training set (n = 40) and should not be interpreted as genuine predictive accuracy.

Table 12. Structured validation cases and their scientific rationale.

Validation Case	Bird Velocity (m/s)	Blade Speed (rad/s)	Scientific Rationale
Case A (Edge 1)	122.5	645	Minimum bird / maximum blade speed. Tests the model’s ability to handle high-divergence inputs.
Case B (Edge 2)	247.5	395	Maximum bird / minimum blade speed. Inverse of Case A; checks for symmetry in error distribution.
Case C (Center)	185	520	Pure interpolation. Verifies the model’s performance in the heart of the training range.
Case D (Extrap.)	260	660	Upper extrapolation. Tests model generalisation outside the training bounds.
Case E (Off-Diag)	210	450	Moderate asymmetry. Realistic scenario where the bird is fast but the engine is at lower power setting.

Table 13. Structured validation results for Lagrangian surrogate models: maximum deformation and von Mises stress.

Case	Bird Vel. (m/s)	Blade Speed (rad/s)	ANSYS Def. (m)	ANSYS Stress (MPa)	Sim. Time (min)	ML Pred. Def. (m)	ML Pred. Stress (MPa)
Case A (Edge 1)	122.5	645	0.5261	1192.1	33	0.6288	1159.2
Case B (Edge 2)	247.5	395	0.7988	1183.2	25	0.6887	1168.2
Case C (Center)	185	520	0.9433	1165.2	22	0.8603	1170.2
Case D (Extrap.)	260	660	0.8683	1244.2	28	0.8304	1213.7
Case E (Off-Diag)	210	450	0.6787	1146.4	30	0.7024	1168.1

Table 14. Structured validation results for SPH surrogate models.

Case	Bird Vel. (m/s)	Blade Speed (rad/s)	ANSYS Def. (m)	ANSYS Stress (MPa)	Sim. Time (min)	ML Pred. Def. (m)	ML Pred. Stress (MPa)
Case A (Edge 1)	122.5	645	0.5203	1252.2	28	0.4158	1118.4
Case B (Edge 2)	247.5	395	0.3217	1187.6	30	0.3264	992.0
Case C (Center)	185	520	0.4193	1192.5	24	0.3923	1099.0
Case D (Extrap.)	260	660	0.5272	1237.8	31	0.4158	1118.4
Case E (Off-Diag)	210	450	0.3640	1165.7	27	0.3264	992.0

* Cases A and D, and Cases B and E, yield identical ML predictions because the inputs of each pair are routed to the same Random Forest leaves. This is not a copy-paste error; it reflects the piecewise-constant nature of tree-ensemble predictions, which take the same value for all inputs falling in the same leaf regions, including inputs beyond the training bounds such as Case D.
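
The piecewise-constant behaviour noted above can be demonstrated on synthetic data. In the sketch below, a Random Forest with the Table 7 settings is fitted to a hypothetical response, and the Case D input (260 m/s, 660 rad/s) is compared with the same input clipped to the sampled range. The data and the response function are assumptions for illustration; the block shows only the saturation property and does not reproduce the identical pairs in Table 14.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# SYNTHETIC training set spanning the sampled range (not the paper's data).
v = rng.uniform(122.5, 247.5, 50)     # bird velocity, m/s
w = rng.uniform(395.0, 645.0, 50)     # blade speed, rad/s
X = np.column_stack([v, w])
y = 1e-3 * w + 1e-4 * v               # HYPOTHETICAL monotonic response

model = RandomForestRegressor(
    n_estimators=25,
    max_depth=5,
    min_samples_leaf=2,
    max_features="sqrt",
    random_state=42,
).fit(X, y)

# Case D lies outside the training bounds in both inputs.
outside = np.array([[260.0, 660.0]])
# The same point clipped to the bounding box of the training inputs.
clamped = np.clip(outside, X.min(axis=0), X.max(axis=0))

pred_outside = model.predict(outside)[0]
pred_clamped = model.predict(clamped)[0]
print(pred_outside, pred_clamped)
```

Because every split threshold lies inside the range of the training inputs, a point beyond the bounds follows the same path through each tree as its clipped counterpart, so the prediction cannot grow past the values learned at the edge of the domain.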

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.