3. Results and Discussion
3.1. Reliability-Oriented Hyperparameter Optimization and Model Performance
To obtain a stable training configuration for the proposed model under heterogeneous data conditions, hyperparameter optimization was conducted using Bayesian Optimization (BO) in combination with cross-validation (CV) [18]. The objective of this procedure is to identify a reliable set of hyperparameters that minimizes validation error while maintaining stable model behavior across different data splits.
BO iteratively refines the hyperparameter configuration by modeling the validation error as a black-box function and selecting candidate parameter sets through an acquisition strategy. Compared with manual tuning or grid-based search, this approach enables more efficient exploration of the parameter space and provides a more reproducible model-selection process.
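As a minimal sketch of the black-box objective described above, the following code averages the validation MAE over K folds for one candidate configuration. The search ranges are those of Table 4. The names `sample_params`, `kfold_indices`, `cv_objective`, and the `fit_predict` callable are hypothetical, and plain random sampling stands in for the acquisition step that a BO library would perform.

```python
import random
from statistics import mean

# Search ranges taken from Table 4; key names are illustrative.
SEARCH_SPACE = {
    "iterations": (40, 200),      # integer-valued
    "learning_rate": (0.01, 0.5),
    "depth": (2, 10),             # integer-valued
    "l2_leaf_reg": (0.01, 1.0),
}
INTEGER_PARAMS = {"iterations", "depth"}


def sample_params(rng):
    """Draw one candidate at random (stand-in for a BO acquisition step)."""
    params = {}
    for name, (low, high) in SEARCH_SPACE.items():
        if name in INTEGER_PARAMS:
            params[name] = rng.randint(low, high)
        else:
            params[name] = rng.uniform(low, high)
    return params


def kfold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs for contiguous, unshuffled folds."""
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def cv_objective(params, X, y, fit_predict, n_folds=5):
    """Mean validation MAE across folds: the value the optimizer minimizes."""
    fold_errors = []
    for train_idx, val_idx in kfold_indices(len(y), n_folds):
        preds = fit_predict(
            params,
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in val_idx],
        )
        fold_errors.append(mean([abs(p - y[i]) for p, i in zip(preds, val_idx)]))
    return mean(fold_errors)
```

Here `fit_predict(params, X_train, y_train, X_val)` is assumed to train a model with the given hyperparameters and return predictions for `X_val`. In a well-based setting, splitting by well rather than by contiguous sample blocks may be more appropriate.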
As shown in Figure 6, the validation error exhibits clear trends with respect to the tested hyperparameters. Increasing the number of iterations reduces the validation error up to a certain point, after which the improvement becomes marginal, indicating convergence in model training. The learning rate also shows a limited effective range: excessively small values lead to underfitting, whereas overly large values result in unstable training and poorer validation performance.
The optimal tree depth is relatively low, suggesting that a simple model structure is sufficient to capture the dominant relationships in the data. Increasing the depth leads to higher validation error, indicating overfitting and increased sensitivity to noise. Similarly, larger values of the regularization coefficient (l2_leaf_reg) tend to increase the validation error, implying that excessive regularization may weaken the model’s ability to fit the data adequately.
These observations indicate that the prediction task favors a configuration with moderate model complexity and stable training behavior. Overall, the selected hyperparameter region is consistent with the need to balance fitting capacity and regularization in a heterogeneous prediction setting.
Table 4.
Optimal CatBoost hyperparameters obtained after tuning.
| Parameter | Description | Search Range | Optimal Value |
| --- | --- | --- | --- |
| iterations | Number of boosting iterations | [40, 200] | 150 |
| learning rate | Step size controlling update magnitude | [0.01, 0.5] | 0.38 |
| depth | Maximum depth of decision trees | [2, 10] | 3 |
| l2_leaf_reg | L2 regularization coefficient | [0.01, 1] | 0.04 |
Based on the optimization results, the final hyperparameter configuration is set as iterations = 150, learning rate = 0.38, depth = 3, and l2_leaf_reg = 0.04. This configuration provides a practical balance between model capacity and regularization, leading to stable training behavior in subsequent experiments.
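The final configuration can be written down as a plain parameter dictionary and checked against the search ranges of Table 4. This is a small sketch: the helper names `out_of_range` and `relative_position` are hypothetical, and the commented-out constructor call assumes a regression target and an installed `catboost` package.

```python
# Values and ranges from Table 4. Keys follow CatBoost keyword arguments
# ("learning rate" in the table corresponds to `learning_rate`).
FINAL_PARAMS = {
    "iterations": 150,
    "learning_rate": 0.38,
    "depth": 3,
    "l2_leaf_reg": 0.04,
}
SEARCH_RANGES = {
    "iterations": (40, 200),
    "learning_rate": (0.01, 0.5),
    "depth": (2, 10),
    "l2_leaf_reg": (0.01, 1.0),
}


def out_of_range(params, ranges):
    """Return the names of parameters lying outside their search range."""
    return [name for name, value in params.items()
            if not ranges[name][0] <= value <= ranges[name][1]]


def relative_position(params, ranges):
    """Map each value to [0, 1] within its range (0 = lower bound, 1 = upper bound)."""
    return {name: (params[name] - low) / (high - low)
            for name, (low, high) in ranges.items()}


# Assumed usage with the catboost package (not executed here):
# model = catboost.CatBoostRegressor(**FINAL_PARAMS)
```

A relative position close to 0 or 1 can hint that a range is worth widening. That is a general tuning heuristic, not a finding of this study.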
Importantly, hyperparameter optimization should be viewed as a supporting training procedure rather than the primary source of performance improvement. Its role is to provide a stable, consistent, and reproducible experimental configuration, whereas the comparative performance gains are evaluated in the following sections with respect to the proposed modeling formulation.
3.2. Lithotype-Aware Characterization of Rock Mechanical Parameters
Following data preprocessing and model optimization, the proposed heterogeneity-aware residual formulation was applied to the prediction wells for continuous characterization of rock mechanical parameters.
Table 5 summarizes the performance across the training, validation, and test sets. The model achieves an R² of 0.982 and a Mean Absolute Error (MAE) of 0.312 MPa on the training set, indicating strong fitting capacity. On the validation and test sets, the R² values are 0.936 and 0.928, with MAE values of 0.594 and 0.641 MPa, respectively.
The validation and test metrics are close to each other, and the gap to the training set is moderate. This indicates that the model maintains stable predictive behavior under cross-well conditions, without evident overfitting.
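For reference, the two metrics reported above can be computed as follows. This is a minimal sketch using the standard definitions of MAE and the coefficient of determination; the function names are illustrative.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A predictor that always returns the mean of the measured values yields an R² of 0, so the reported values can be read as the fraction of variance explained beyond that trivial baseline.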
To further examine the effectiveness of the method under heterogeneous conditions, Well-Validate-1 and Well-Test-1 were selected as representative cases for comparison with conventional empirical approaches. Traditional methods assume lithological homogeneity and apply a single parametric relationship across the entire well interval. Under heterogeneous conditions, this assumption leads to systematic errors, particularly in lithological transition zones where identical logging responses may correspond to different mechanical properties.
In contrast, the proposed formulation separates geomechanical response into a global component and lithotype-dependent deviations. This enables the model to adapt its prediction according to lithological regime, capturing variations at lithological interfaces and reducing systematic bias in heterogeneous intervals.
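The separation into a global component and lithotype-dependent deviations can be illustrated with a deliberately simple sketch. It assumes a single logging feature, a least-squares line as the global model, and a constant mean offset per lithotype as the residual term. These are simplifying stand-ins chosen for illustration, not the learners used in this study, and all function names are hypothetical.

```python
from collections import defaultdict


def fit_global_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x (global component)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_lithotype_offsets(x, y, lithotypes, intercept, slope):
    """Mean residual (measured minus global prediction) for each lithotype."""
    sums, counts = defaultdict(float), defaultdict(int)
    for xi, yi, li in zip(x, y, lithotypes):
        sums[li] += yi - (intercept + slope * xi)
        counts[li] += 1
    return {li: sums[li] / counts[li] for li in sums}


def predict(xi, lithotype, intercept, slope, offsets):
    """Global trend plus the lithotype-conditioned correction (0 if unseen)."""
    return intercept + slope * xi + offsets.get(lithotype, 0.0)
```

On data where two lithotypes share the same trend but differ by a constant shift, the global line alone lands between the two groups, and the offsets recover the shift for each group.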
The observed performance improvement should not be interpreted solely as an increase in prediction accuracy. Instead, it reflects a mitigation of the non-uniqueness in the mapping between logging responses and mechanical properties under heterogeneous conditions. By explicitly modeling lithotype-conditioned residuals, the method provides additional flexibility to distinguish cases that are difficult to separate under a single global mapping.
Furthermore, the improvement is not solely attributable to the inclusion of additional information. The same logging features are used in both conventional and proposed approaches; the difference lies in how the mapping is structured. The proposed method explicitly models lithology-related variations, whereas conventional methods implicitly average over such effects.
As a result, the method reduces lithotype-induced bias and improves the consistency between predicted mechanical parameters and underlying geological conditions. This provides a more reliable basis for geomechanical characterization in heterogeneous formations and suggests that the performance gain is associated with the proposed modeling formulation.
In summary, the results indicate that the proposed approach helps mitigate the non-uniqueness in geomechanical characterization under heterogeneous conditions, while maintaining stable predictive performance and physical consistency.
3.2.1. Case Study: Well-Validate-1
Well-Validate-1 was selected as a representative validation well to examine the behavior of the proposed method under heterogeneous lithological conditions. In the depth interval of 2700–2950 m, a relatively stable coal seam with a thickness of approximately 48 m is developed, interbedded with 4–6 parting layers totaling about 12 m. A total of 23 experimental measurements are available within this interval, covering all four lithotypes (bright, semi-bright, semi-dull, and dull coal), providing a suitable basis for evaluating both predictive accuracy and lithotype-dependent behavior. The predicted profiles, including HMLZ, lithotype classification, and mechanical parameters, are shown in Figure 7.
The predicted UCS profile shows clear alignment with lithotype variations along the depth axis. In the bright coal interval (2750–2800 m), where HMLZ values indicate low-strength lithotypes, the predicted UCS remains within a low range (18–28 MPa), consistent with measured values and yielding a Mean Absolute Error (MAE) of 2.3 MPa. Local fluctuations within this interval are also captured, reflecting sensitivity to small-scale structural variations.
At the lithological transition near 2800 m, an increase in HMLZ is accompanied by a rise in predicted UCS over a short depth interval. This transition is closely aligned with measured data, indicating that the model responds to lithotype changes rather than producing a smoothed global trend. In the dull coal section around 2820 m, the predicted UCS stabilizes within the high-strength range and remains consistent with experimental observations.
In contrast, the traditional empirical approach exhibits systematic distortion across lithotypes. In bright coal, it overestimates UCS due to bias toward higher-strength samples, while in dull coal it underestimates strength. More critically, the traditional prediction curve lacks sensitivity to lithological transitions, resulting in a smooth, low-frequency trend that fails to capture abrupt changes in mechanical behavior.
A lithotype-wise comparison reveals that errors in the traditional method are concentrated in extreme lithotypes, whereas the proposed method maintains consistently low errors across all categories. Across all 23 samples, the proposed method reduces the MAE from 9.2 MPa to 3.1 MPa and improves the coefficient of determination from 0.69 to 0.94.
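The size of the MAE reduction quoted above can be expressed as a relative change. The helper below is an illustrative sketch, and its name is hypothetical.

```python
def relative_reduction_percent(baseline_error, new_error):
    """Percentage by which an error metric decreases relative to the baseline."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# MAE values reported for Well-Validate-1: 9.2 MPa (traditional) and 3.1 MPa (proposed).
reduction = relative_reduction_percent(9.2, 3.1)
print(f"MAE reduction: {reduction:.1f}%")
```

With the reported values this evaluates to roughly a two-thirds reduction in MAE.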
These results should not be interpreted solely as an improvement in prediction accuracy. Instead, they indicate that the proposed formulation helps reduce lithotype-induced bias by explicitly modeling conditional deviations. The key distinction is that the method does not enforce a single global mapping, but adapts predictions according to lithological regime.
Further evidence of this mechanism can be observed in the behavior of other mechanical parameters. The elastic modulus decreases in low-strength lithotypes and increases in high-strength lithotypes, while Poisson’s ratio exhibits the opposite trend. These patterns are consistently captured by the proposed method but are systematically distorted in the traditional approach. The agreement across multiple parameters indicates that the model preserves physically consistent relationships rather than fitting isolated targets.
Overall, the case study demonstrates that lithotype-dependent deviations are not random noise but structured variations governed by geological conditions. By explicitly modeling these variations, the proposed method captures regime-dependent behavior and helps mitigate the non-uniqueness inherent in heterogeneous systems. This suggests that the performance gain is associated with the modeling of lithology-dependent variations.
3.2.2. Case Study: Well-Test-1
Well-Test-1 serves as an independent test well that was not involved in either the training or validation stages. It is therefore used to evaluate the generalization capability of the proposed heterogeneity-aware formulation under more complex geological conditions. Compared with Well-Validate-1, this well exhibits significantly stronger lithological heterogeneity. Within the 2760–2920 m interval, the HMLZ curve identifies 37 lithotype transitions, corresponding to an average spacing of approximately 4.3 m, indicating a high-frequency heterogeneous system. In addition, multiple thin parting layers are present in the 2800–2850 m interval, forming frequent interbedded contacts with coal seams. This configuration imposes stringent requirements on the model’s ability to resolve rapid lithological transitions and to maintain consistency across narrow depth intervals. A total of 18 experimental measurements are available, including nine points located within high-frequency transition zones, providing a demanding test scenario. The prediction results are shown in Figure 8.
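The average transition spacing quoted for Well-Test-1 follows from the interval length divided by the number of transitions. The sketch below also shows one way to count transitions in a depth-ordered lithotype sequence; both function names are illustrative.

```python
def count_transitions(labels):
    """Number of positions where the lithotype differs from the previous sample."""
    return sum(1 for prev, cur in zip(labels, labels[1:]) if prev != cur)


def mean_transition_spacing(top_m, bottom_m, n_transitions):
    """Average depth spacing between transitions over an interval, in metres."""
    return (bottom_m - top_m) / n_transitions


# 2760-2920 m interval with 37 transitions, as reported for Well-Test-1.
spacing = mean_transition_spacing(2760.0, 2920.0, 37)
print(f"average spacing: {spacing:.1f} m")
```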
In the high-frequency transition interval (2800–2850 m), the proposed method shows clear responsiveness to lithotype variations. Multiple step-like changes in UCS are captured within short depth ranges, and the prediction curve remains synchronized with lithotype transitions indicated by the HMLZ profile. At lithological interfaces, the model produces rapid and localized adjustments in predicted strength, consistent with measured data. Thin parting layers are also correctly identified as high-strength zones, with predictions transitioning smoothly to adjacent coal intervals.
In contrast, the traditional empirical approach shows a loss of stability under these conditions. The prediction curve exhibits irregular oscillations that are not aligned with lithotype changes, and abrupt variations appear even in relatively stable intervals. This behavior is consistent with the limitation of enforcing a single global mapping, which cannot accommodate rapid regime changes.
Quantitative evaluation further highlights this difference. In the high-frequency transition zone, the proposed method maintains low prediction error and a high proportion of samples within a narrow error band, whereas the traditional method shows significantly larger deviations and reduced consistency. Across the entire well, the proposed method achieves substantially lower error and higher correlation with measured values, while maintaining similar performance levels to those observed in the validation well. The small difference in error between validation and test wells suggests that the learned mapping maintains stable predictive performance on unseen data.
These results should not be interpreted solely as accuracy improvement. Under high-frequency heterogeneity, the mapping from logging responses to mechanical properties becomes highly non-unique, and the residual component becomes more significant. The ability of the proposed method to maintain stable predictions suggests that the residual term captures lithotype-dependent variations rather than random fluctuations.
In particular, the alignment between predicted step changes and lithotype transitions suggests that the residual is not purely noise, but is related to lithological regime. This indicates that the decomposition into global and lithotype-conditioned components provides a reasonable representation of the underlying geomechanical behavior.
Overall, the comparison across validation and test wells shows that the proposed formulation remains stable under both moderate and high-frequency heterogeneity. The improvement in performance is therefore associated with the explicit modeling of lithological-regime-related variations, rather than with increased model complexity or additional features.
This suggests that the proposed approach helps mitigate the non-uniqueness in geomechanical characterization under heterogeneous conditions, achieving consistent cross-well generalization while preserving physical interpretability.
3.3. Ablation Study Results and Quantitative Analysis
To quantify the role of the lithotype-conditioned residual formulation, three modeling configurations were evaluated on the independent test well (Well-Test-1). The comparison focuses on uniaxial compressive strength (UCS) prediction and is summarized in Table 6.
The baseline model, which relies solely on logging features, captures the overall variation trend but exhibits significant errors in intervals where similar logging responses correspond to different lithotypes. This suggests that a single global mapping may be insufficient under heterogeneous conditions.
Introducing lithotype as an additional input improves performance, indicating that lithological information provides useful constraints. However, the improvement remains limited, suggesting that treating lithotype as a conventional feature does not fully resolve the ambiguity in the mapping.
In contrast, the proposed formulation models lithotype-induced deviations as a separate residual component. This leads to a noticeable performance improvement, with R² increasing to 0.928 and MAE reduced by more than 60% compared to the baseline. More importantly, this improvement is achieved without introducing new information, but by restructuring the mapping itself.
The comparison between the feature-augmented model and the proposed formulation provides evidence that the residual is not purely random noise. If lithotype-dependent variations were unstructured, incorporating lithotype as a feature would be sufficient. However, the additional improvement achieved by the residual formulation indicates that these variations may follow a systematic pattern that cannot be captured within a single mapping.
This result suggests that lithotype is associated with condition-dependent deviations in geomechanical response. By explicitly modeling these deviations, the proposed method separates global trends from lithotype-specific corrections, thereby helping mitigate the non-uniqueness inherent in heterogeneous systems.
In this sense, the performance gain should be interpreted as a consequence of modeling conditional bias rather than improving predictive capacity. The results suggest that the relationship between logging responses and mechanical parameters may exhibit multi-regime characteristics, and that accurate characterization requires explicit representation of lithotype-dependent structure.
Overall, the ablation study provides quantitative evidence that the proposed approach is structurally different from conventional formulations and addresses limitations associated with global mapping assumptions in heterogeneous geological environments.
3.4. Mechanism Analysis of Structured Residual Correction
To verify that the performance gain of the proposed framework arises from modeling structured lithotype-dependent deviations rather than from a purely empirical two-stage refinement, a dedicated residual analysis was conducted. This analysis examines the error distribution of the baseline global model and evaluates how the residual component systematically resolves heterogeneity-induced bias.
3.4.1. Identification of Structured Bias in the Global Baseline
A fundamental assumption of this study is that a single global mapping inevitably introduces systematic bias in heterogeneous formations because it “averages” the mechanical responses of different lithotypes. To test this, the Signed Mean Residual (SMR) and Standard Deviation (SD) of the baseline model were calculated for each coal lithotype in the test set.
As shown in Table 7, the baseline residuals are not randomly distributed white noise; instead, they exhibit a clear polarity tied to the lithological regime. In low-strength bright coal intervals, the baseline model consistently overestimates UCS (SMR = +2.45 MPa), whereas in high-strength dull coal intervals, it tends to underestimate the strength (SMR = -3.12 MPa). This systematic departure suggests that the “ambiguity” mentioned in the introduction (where similar logging responses correspond to different mechanical properties) manifests as a structured bias in a unified predictor.
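A per-lithotype residual summary of this kind can be sketched as follows. The sign convention (predicted minus measured) is inferred from the text, where a positive SMR corresponds to overestimation. The use of the sample standard deviation is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def residual_stats_by_lithotype(measured, predicted, lithotypes):
    """Signed mean residual (SMR) and standard deviation for each lithotype.

    Residuals are predicted minus measured, so a positive SMR means the
    model overestimates on average for that lithotype.
    """
    groups = defaultdict(list)
    for m, p, lith in zip(measured, predicted, lithotypes):
        groups[lith].append(p - m)
    return {
        lith: {
            "n": len(res),
            "smr": mean(res),
            # Sample SD needs at least two points; report 0.0 otherwise.
            "sd": stdev(res) if len(res) > 1 else 0.0,
        }
        for lith, res in groups.items()
    }
```

If the residuals were unstructured noise, the SMR of every lithotype would be expected to lie close to zero relative to its SD. Opposite signs across lithotypes are what indicate a regime-dependent bias.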
3.4.2. Amplification of Bias in Transition Zones
The structured bias is further intensified in lithological transition zones. Transition zones are defined here as intervals within 0.5 m of a lithotype boundary identified by the HMLZ index. Figure 9 compares the residual density between stable intervals and transition zones.
In stable lithological intervals, the baseline model exhibits a relatively narrow error distribution. However, in transition zones, the residual variance increases by approximately 140%, and the distribution becomes markedly bimodal. This phenomenon indicates that near lithological interfaces, the global mapping fails to track the rapid shift in geomechanical response, even when the logging signals (e.g., AC or DEN) show only subtle variations. This is consistent with the conclusion that heterogeneity-induced ambiguity is a localized weak point for traditional modeling approaches.
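The transition-zone definition and the variance comparison above can be sketched as follows. Placing each boundary midway between two adjacent samples of different lithotype is an assumption made for illustration, as is the use of the population variance; the function names are hypothetical.

```python
from statistics import pvariance


def boundary_depths(depths, lithotypes):
    """Depths of lithotype boundaries, placed midway between differing neighbours."""
    bounds = []
    for i in range(1, len(depths)):
        if lithotypes[i] != lithotypes[i - 1]:
            bounds.append(0.5 * (depths[i] + depths[i - 1]))
    return bounds


def in_transition_zone(depth, boundaries, half_width_m=0.5):
    """True if the depth lies within half_width_m of any lithotype boundary."""
    return any(abs(depth - b) <= half_width_m for b in boundaries)


def variance_increase_percent(stable_residuals, transition_residuals):
    """Relative increase of residual variance in transition zones, in percent."""
    ratio = pvariance(transition_residuals) / pvariance(stable_residuals)
    return 100.0 * (ratio - 1.0)
```

Under this measure, an increase of 140% corresponds to a transition-zone variance 2.4 times that of the stable intervals.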
3.4.3. Effectiveness of the Lithotype-Conditioned Correction
The proposed method addresses this by explicitly modeling these structured errors through the lithotype-conditioned residual component. Figure 10 illustrates the “flattening” effect of the residual correction. After incorporating the lithotype-conditioned component, the SMR for all lithotypes converged toward zero (e.g., dull coal SMR improved from -3.12 MPa to -0.18 MPa).
Crucially, the standard deviation of the residuals also decreased across all regimes, indicating that the residual component does not just shift the mean but also reduces the uncertainty within each lithotype. This transition from a “lithotype-biased” error to a “lithotype-neutral” error provides strong evidence that the performance gain is associated with the proposed formulation. The residual model successfully captures the conditional deviations induced by the HMLZ-defined regimes, improving the physical consistency of the characterization.
This observation indicates that the residual is not only structured, but also explicitly dependent on lithotype, confirming that lithological regimes act as conditioning variables governing systematic deviations in geomechanical response. This supports the assumption that the mapping from X to Y is multi-regime rather than globally unique.
In summary, the residual analysis demonstrates that: (1) baseline errors are geologically structured; (2) this structure is driven by lithological heterogeneity; and (3) the proposed decomposition captures this structure, transforming a biased global mapping into a lithotype-aware characterization framework.
3.5. Heterogeneity-Focused Evaluation in Transition Zones
To further examine the role of lithotype-induced heterogeneity, a focused evaluation was conducted on the 2800–2850 m interval of Well-Test-1, where bright coal and dull coal are frequently interbedded. This interval represents a typical high-frequency heterogeneous regime, in which the mapping between logging responses and mechanical properties becomes highly non-unique. The quantitative results are summarized in Table 8.
The baseline model shows a pronounced degradation in performance within this interval, producing overly smoothed predictions that fail to capture sharp variations in mechanical properties across lithological boundaries. This behavior is consistent with the limitation of a single global mapping when applied to a multi-regime system.
Introducing lithotype as an additional feature improves sensitivity to lithological variation, but substantial errors remain. This indicates that while lithotype contains relevant information, embedding it within a unified mapping does not sufficiently resolve the ambiguity caused by heterogeneity.
In contrast, the proposed formulation maintains stable predictive performance in the transition zone. The improvement is particularly evident in the reduction of maximum error, suggesting that abrupt changes in mechanical properties are better captured. This behavior suggests that the model is able to adapt its response locally in accordance with lithological transitions.
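An interval-restricted evaluation of this kind, including the maximum error, can be sketched as follows. The function name and the returned keys are illustrative, and the depth window is passed in explicitly.

```python
def interval_errors(depths, measured, predicted, top_m, bottom_m):
    """MAE and maximum absolute error for samples inside a depth window."""
    errors = [abs(p - m)
              for d, m, p in zip(depths, measured, predicted)
              if top_m <= d <= bottom_m]
    if not errors:
        raise ValueError("no samples fall inside the requested interval")
    return {
        "n": len(errors),
        "mae": sum(errors) / len(errors),
        "max_error": max(errors),
    }
```

The maximum error is sensitive to single misfit points at lithological interfaces, which is why it complements MAE when sharp changes in mechanical properties are of interest.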
More importantly, this interval provides additional evidence of the role of the residual component. In transition zones, the global mapping becomes insufficient and the residual term becomes more significant. The significant performance gap between the feature-augmented model and the proposed formulation indicates that lithotype-induced deviations are not purely random and require explicit modeling rather than implicit representation.
This observation supports the interpretation that heterogeneity introduces condition-dependent bias into the mapping. By modeling this bias as a lithotype-conditioned residual, the proposed method helps mitigate the ambiguity that arises in transition zones and maintains consistent predictive behavior.
Overall, the results demonstrate that the advantage of the proposed formulation is most pronounced in intervals where heterogeneity is strongest. This suggests that the method addresses limitations of global mapping approaches and provides a reliable characterization of mechanical properties in complex geological settings.
In this sense, transition zones serve as a critical test for evaluating whether the residual component captures structured variations. The consistent improvement observed in this interval indicates that the residual is not noise, but is related to lithological regime, providing support for the proposed heterogeneity-aware modeling framework.
3.6. SHAP-Based Interpretability Analysis
To examine whether the proposed heterogeneity-aware formulation captures physically meaningful and lithotype-consistent behavior under heterogeneous geological conditions, SHAP (SHapley Additive Explanations) was employed to analyze the relationships between logging responses, lithological regimes, and predicted mechanical parameters. In this study, SHAP is used not merely as a post hoc explanation tool, but as a diagnostic for checking whether the learned behavior is physically and lithologically consistent.
At the global level, the mean absolute SHAP value was calculated for both the training and test sets to rank feature importance. As shown in Table 9, acoustic transit time (AC) is the most influential feature in both datasets, followed by gamma ray (GR), density (DEN), and resistivity-related features. The overall consistency of the feature ranking between the training and test sets indicates that the model captures stable controlling factors across wells, rather than overfitting to specific local patterns.
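Given a matrix of SHAP values (one row per sample, one column per feature), the global ranking described above reduces to a column-wise mean of absolute values. The sketch below assumes the SHAP values have already been computed elsewhere; the function names are hypothetical.

```python
def rank_features_by_mean_abs_shap(shap_values, feature_names):
    """Return (feature, mean |SHAP|) pairs sorted from most to least influential."""
    n_samples = len(shap_values)
    importance = {
        name: sum(abs(row[j]) for row in shap_values) / n_samples
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


def same_ranking(ranking_a, ranking_b):
    """True if two rankings list the features in the same order."""
    return [name for name, _ in ranking_a] == [name for name, _ in ranking_b]
```

Calling the first function on the training and test SHAP matrices and comparing the results with `same_ranking` gives a simple check of the ranking consistency discussed above.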
This global ranking is physically interpretable. AC reflects the propagation behavior of acoustic waves and is closely associated with fracture development, pore structure, and structural integrity. DEN characterizes material compactness and bulk structural condition, while GR provides supplementary information related to compositional variability, ash content, and clay-related effects. Together, these features form a physically meaningful basis for geomechanical characterization.
To further examine the direction and distribution of feature effects, a SHAP summary plot was generated, as shown in Figure 11. The summary plot presents the SHAP contribution of each sample together with the corresponding feature value, thereby revealing both the magnitude and polarity of feature influence. High AC values generally correspond to negative SHAP contributions, indicating a reduction in the predicted mechanical parameters, whereas low AC values tend to produce positive contributions. This agrees with rock physics expectations, since larger transit time is usually associated with poorer structural integrity and lower load-bearing capacity. In contrast, higher DEN values generally produce positive contributions, reflecting the higher strength expected in denser and more compact formations.
Unlike simple correlation analysis, SHAP can reveal conditional effects arising from nonlinear interactions and lithotype-dependent modulation. To assess whether lithological heterogeneity is explicitly reflected in the model behavior, the SHAP contributions associated with coal lithotypes were further examined. The analysis shows that even within similar AC or DEN intervals, SHAP values exhibit systematic offsets across lithotypes. Lower-strength lithotypes tend to produce more negative contributions, whereas higher-strength lithotypes more often produce positive contributions. This indicates that identical logging responses do not correspond to a unique mechanical implication, but are interpreted differently depending on lithological regime.
This result provides additional support for the proposed formulation. If lithotype-induced variation were merely random noise, samples with similar logging features would show similar SHAP contributions regardless of lithotype. Instead, the observed systematic separation suggests that the model captures regime-dependent behavior, which is consistent with the assumption that heterogeneity introduces condition-dependent variations into the mapping from logging data to mechanical properties.
To visualize this modulation more directly, local contribution analyses were performed for representative depth samples.
Figure 12 shows how the final prediction is decomposed into additive feature contributions for specific examples. For low-strength samples located in bright coal intervals, the prediction below the baseline is typically driven by the joint effect of high AC, low DEN, and lithotype-associated negative contributions. For high-strength samples in dull or semi-dull coal intervals, the opposite pattern is observed, with low AC, high DEN, and lithotype-associated positive contributions driving the prediction above the baseline. This decomposition provides a traceable explanation for why the predicted mechanical parameter at a given depth is high or low.
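The additive decomposition behind such local explanations can be sketched as follows: the prediction for one sample equals the baseline (expected) value plus the sum of its per-feature SHAP contributions. The numbers below are hypothetical and chosen only to mimic a low-strength bright coal sample; they are not values from this study.

```python
def reconstruct_prediction(base_value, contributions):
    """Baseline value plus the sum of per-feature SHAP contributions."""
    return base_value + sum(contributions.values())


def dominant_drivers(contributions, top_k=2):
    """Names of the top_k features with the largest absolute contribution."""
    ranked = sorted(contributions, key=lambda name: abs(contributions[name]),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical sample: high AC, low DEN, and a negative lithotype term.
base_value = 30.0  # hypothetical baseline UCS in MPa
contributions = {"AC": -6.0, "DEN": -2.5, "lithotype": -3.0, "GR": 0.5}
print(reconstruct_prediction(base_value, contributions))
print(dominant_drivers(contributions))
```

In this hypothetical case the negative AC, DEN, and lithotype terms jointly pull the prediction below the baseline, which mirrors the pattern described for bright coal intervals.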
Taken together, the SHAP results provide consistent evidence from multiple perspectives: global importance ranking, direction of feature influence, lithotype-dependent modulation, and local sample-wise decomposition. More importantly, they support the central claim of this study that the performance gain does not arise from feature enrichment alone, but from the structural modeling of lithotype-conditioned deviations. The interpretability analysis therefore suggests that the proposed method captures physically meaningful and geologically consistent patterns, and that the residual component represents lithotype-dependent behavior rather than arbitrary correction.
In summary, the SHAP-based analysis demonstrates that the proposed method preserves physical consistency, reflects lithotype-controlled modulation of geomechanical response, and provides additional interpretability support for the heterogeneity-aware residual formulation. These findings reinforce the conclusion that the model helps mitigate heterogeneity-induced non-uniqueness through structured conditional modeling rather than through incremental adjustment of a single global predictor.