Preprint
Article

This version is not peer-reviewed.

Explainable AI for Financial Distress: Evidence from Market Volatility and Regime Dynamics

Submitted:

31 March 2026

Posted:

01 April 2026


Abstract
This study investigates the role of market volatility, proxied by the CBOE Volatility Index (VIX), as a potential amplifier of corporate leverage risk within the S&P 100. Addressing the limitations of traditional financial distress models in capturing non-linear and regime-dependent dynamics, we employ XGBoost combined with SHAP-based explainable AI (XAI) on a longitudinal dataset spanning 2000-2025. The results show that total debt remains the dominant predictor of financial distress, while the influence of risk-related variables such as the VIX and equity returns increases during crisis periods. Monetary policy indicators become more important during pandemic conditions, whereas inflation dominates in stable environments. These findings highlight the regime-dependent nature of financial risk drivers and demonstrate the value of explainable machine learning in developing interpretable early warning systems. By integrating predictive accuracy with interpretability, this study provides new insights into the interaction between firm-level leverage and external market volatility.

1. Introduction

Corporate financial distress prediction has been a central topic in finance, with initial models primarily relying on linear relationships between accounting variables and financial distress. Contributions by Altman (1968) and Ohlson (1980) are seminal in this area, with subsequent research incorporating market information and hazard models (Shumway, 2001; Chava and Jarrow, 2004). More recent studies have shown the importance of market information and macroeconomic factors in corporate distress prediction (Campbell et al., 2008; Bharath and Shumway, 2008). However, most traditional econometric models assume linearity and parameter stability, assumptions that are unlikely to hold in a modern financial climate characterized by structural breaks, crisis episodes, and regime-dependent risk dynamics.
Building on this limitation, a growing literature suggests that financial distress is inherently non-linear and sensitive to macroeconomic regimes. Duffie et al. (2007) show that default intensities vary with the prevailing macroeconomic regime, while Ang and Timmermann (2012) highlight the importance of regime shifts in financial markets. Similarly, Adrian et al. (2014) show how leverage cycles amplify systemic risk, and Giglio et al. (2016) demonstrate the effects of macroeconomic uncertainties on financial distress. Therefore, modelling financial distress requires approaches capable of capturing non-linear interactions and time-varying relationships. In this context, Acosta-González et al. (2019) argue that traditional econometric methods such as the Logit model face structural challenges in dealing with high-dimensional data sets, including problems related to redundancy and the ability to detect sudden changes in financial risk. These limitations motivate the transition toward flexible, data-driven modelling frameworks.
In response, machine learning approaches extend classical models by allowing flexible, non-linear relationships in financial distress prediction. Kernel-based methods (Hui & Sun, 2006) and more recent ensemble techniques such as XGBoost (Huang & Yen, 2019; Liu, 2023) have demonstrated promising performance in identifying hidden distress signals and operational vulnerabilities compared with traditional linear models. Empirical evidence from Lessmann et al. (2015) and Barboza et al. (2017) shows that non-linear algorithms consistently outperform conventional approaches in credit risk and financial distress prediction tasks. Similarly, Gu et al. (2020) document that machine learning techniques are effective in capturing complex interactions within high-dimensional financial data. Among these methods, ensemble approaches, particularly gradient boosting algorithms such as XGBoost, have gained considerable popularity due to their predictive accuracy and robustness. However, their “black box” nature limits economic interpretability and may hinder their usefulness for policy, regulatory, and managerial decision-making.
The importance of interpretability becomes even more pronounced when considering that the drivers of financial distress are not stable over time. An increasing body of research has highlighted the regime-dependent nature of financial risk factors. For example, Choi (2024) and Cheong & Hoang (2021) document changes in the order of importance of these determinants: micro-level determinants such as profitability dominate during more stable economic regimes, whereas macro-level determinants such as inflation and credit conditions dominate during periods of financial distress. This is further complicated by other channels of financial distress transmission, such as health shocks (Tanaka et al., 2025) and global trade networks during times of geopolitical instability (Zhang et al., 2025). Despite these advances, the literature has yet to fully examine how market-wide instability interacts with firm-level structural vulnerabilities. Therefore, moving beyond predictive accuracy toward explainable artificial intelligence (XAI) frameworks is essential for identifying not only whether distress occurs, but also why its drivers evolve across economic regimes.
One important yet understudied channel of such uncertainty is market volatility. The VIX index is widely viewed as a barometer of investor uncertainty and risk aversion (Whaley, 2000; Bloom, 2009). It has been shown to affect investment decisions and asset pricing (Bekaert et al., 2013; Bali et al., 2017). However, it remains unclear whether heightened market uncertainty amplifies structural vulnerabilities in corporate balance sheets and, consequently, increases the probability of financial distress.
To address these gaps, this paper develops an XAI framework to examine regime-dependent financial distress dynamics for firms in the S&P 100 index. This perspective allows for explicit investigation of effects often obscured in conventional estimations, extending the work of Kalash (2023) and Nowicki et al. (2024). Specifically, this study integrates XGBoost with SHAP values to test the hypothesis that market volatility, proxied by the VIX index, acts as a multiplier on leverage-related financial distress risk. By focusing on large-cap firms, the paper provides system-level insights into the interaction between macroeconomic uncertainty and firm-level vulnerabilities in developed markets.
This paper contributes to the literature in four ways. Firstly, it examines the relationship between market volatility and corporate leverage in a non-linear predictive model. Secondly, it proposes a novel approach to explainable AI in financial distress prediction. Thirdly, it examines the regime dependence of financial distress risk factors. Finally, it provides a novel empirical analysis of large-cap firms in developed markets. These contributions collectively advance understanding of how macroeconomic uncertainty interacts with firm-level financial fragility.
Overall, the study argues that corporate survival in an interconnected economy depends on complex non-linear interactions between internal balance sheet dynamics and external market volatility. By combining predictive modelling with explainability techniques, the proposed framework provides novel insights into how macro-level uncertainty impacts micro-level financial fragility. The rest of the paper is outlined as follows: Section 2 provides a review of the literature. Section 3 describes the data and methodology. Section 4 provides empirical results. Section 5 concludes the paper.

2. Literature Review and Hypotheses Development

2.1. Non-Linearity in Financial Stability: A Microstructure Perspective

The theoretical foundation of financial distress prediction has progressively shifted from a purely linear and deterministic perspective toward an understanding of complex, non-linear dynamics. Classical corporate-finance models assumed a smooth and gradual erosion of solvency, as illustrated by the structural model of Merton (1974). However, recent studies demonstrate that default risk often materializes through abrupt regime shifts or tipping points (Liu et al., 2025). In other words, a firm’s health does not follow a predictable straight line; instead, once debt levels, market volatility, or macro-financial stress cross critical thresholds, the probability of distress can rise exponentially (Yang et al., 2025).
Within market microstructure, these non-linearities are further magnified by liquidity frictions and the rapid feedback loop between high-frequency trading signals and fundamental accounting information. Empirical work shows that liquidity-driven price dislocations and ultra-fast order-flow dynamics can trigger sudden spikes in default risk that traditional structural models cannot capture (Kirilenko et al., 2017). Consequently, modern predictive frameworks increasingly adopt non-linear modelling approaches, including regime-switching models, deep-learning architectures (Mienye et al., 2024), and graph neural networks (Tiwari, 2025), which are designed to capture multi-dimensional risk interdependencies (Wang & Jin, 2025).
Empirically, the shift toward non-linear modelling began with recognition of the limitations inherent in traditional frameworks. Acosta-González et al. (2019), while achieving success with Logit models in the Spanish construction sector, underscored persistent challenges related to multicollinearity and variable redundancy, which obscure predictive signals in high-dimensional environments. Similarly, early comparative research conducted by Hui and Sun (2006) demonstrated that Support Vector Machines (SVM) outperform multivariate discriminant analysis (MDA) by allowing flexible decision boundaries. These findings emphasize the need for models that do not impose a constant relationship between predictors and outcomes.
The evolution toward ensemble learning represents a significant advancement in addressing these complexities. Huang and Yen (2019) established through a comprehensive study of publicly listed firms that XGBoost consistently outperforms traditional classifiers by effectively partitioning non-linear feature spaces. This capacity to handle high-dimensional, complex data is vital for detecting what traditional models treat as noise. For instance, Liu (2023) finds that XGBoost improves diagnostic precision in financial fraud prediction by capturing interaction effects among governance and accounting features, which classical models fail to detect.
Further evidence supports the presence of non-monotonic and interaction-driven relationships. Wang et al. (2025) identify U-shaped relationships between corporate social responsibility and financialization, while Ergenç and Aktaş (2025) demonstrate, using SHAP interaction values, that the influence of liabilities depends on macroeconomic conditions and clustered market volatility, phenomena that are mathematically invisible to standard linear frameworks. These findings indicate that financial distress is the result of non-monotonic interactions rather than simple additive effects.
In contemporary markets where information asymmetry is prevalent, the robustness of non-linear models within an XAI framework offers a transformative diagnostic advantage. Tran et al. (2022) identify influential non-linear predictors in emerging markets using SHAP values, while Qi (2025) applies XGBoost to high-risk SME environments. Chan et al. (2025) report strong interpretative consistency across tree-based models, and Chan et al. (2026) demonstrate temporal stability of identified predictors across macroeconomic regimes. Together, these studies highlight the diagnostic advantages of combining non-linear modelling with explainability.
Despite these advancements, limited research has integrated market microstructure signals and volatility dynamics in large-cap global indices such as the S&P 100. Understanding how these forces interact to trigger abrupt distress requires an XAI-driven framework capable of identifying hidden risk patterns and threshold-driven transitions. Based on the theoretical premise of non-linear risk thresholds and the empirical capacity of ensemble learning, the following hypothesis is proposed:
Hypothesis (H1): 
Non-linear machine learning architectures, within an Explainable AI (XAI) framework, possess the capacity to accurately identify corporate financial stress. 

2.2. Temporal Dynamics and Regime-Switching in Financial Risk Drivers

The theoretical framework for understanding financial distress has shifted from static observations to a dynamic, regime-dependent perspective (Yang et al., 2025). This perspective is rooted in the Financial Instability Hypothesis proposed by Minsky (1986), which argues that determinants of corporate health are not invariant but are conditioned by the broader economic cycle. During stable periods, financial risk is largely idiosyncratic, and firm-specific variables dominate. Traditional theories such as the Trade-off theory and Pecking Order framework (Myers & Majluf, 1984) emphasize leverage and liquidity decisions as primary determinants of survival. However, during systemic shocks, macroeconomic forces increasingly dominate firm-level fundamentals (Pradhan et al., 2025).
This regime shift implies that external conditions such as inflation, interest rates, and geopolitical tensions influence borrowing capacity and firm survival (Berninger et al., 2021). Liquidity constraints further heighten these dynamics, as highlighted by the banking fragility framework of Diamond & Dybvig (1983). Consequently, the determinants of financial distress vary across economic cycles.
Empirically, the role of macroeconomic factors as catalysts for capital structure changes is well-documented. Sahin (2018) utilized dynamic panel data to reveal that inflation and lagged debt ratios significantly influence corporate financial decisions, suggesting that internal firm structures are always tethered to the broader macro-environment. This volatility is further reflected in the work of Ibrahimov et al. (2025), which demonstrates that macroeconomic stress indices exert a pervasive negative effect on firm profitability, often overwhelming sector-specific advantages.
The distinction between stability and crisis is further illuminated by studies on firm resilience. Lee et al. (2017) noted that while internal attributes like firm age provided a buffer during the Global Financial Crisis, their effectiveness was heavily dependent on the prevailing market regime. Similarly, Cheong and Hoang (2021) showed that the impact of GDP growth and inflation on profitability varies significantly between stable phases and crisis periods. This regime-dependency is echoed in the work of Choi et al. (2024), which provides direct evidence that capital structure adjustments are primarily driven by internal factors during stable times, while macroeconomic credit premiums become the dominant force during crises.
Modern crises introduce additional complexity. Tanaka et al. (2025) demonstrate that bankruptcy drivers during the COVID-19 pandemic differed from traditional crises, highlighting how specific shocks redefine the hierarchy of risk predictors. This complexity is compounded by global trade interdependencies. Zhang et al. (2025) show that geopolitical shocks transmit risk through international trade networks, where external trade centrality becomes a critical survival factor. To capture these shifts, Yan et al. (2020) argue for multi-period lagged systems that integrate both micro and macro windows. Meanwhile, Labosova et al. (2025) acknowledge that ignoring the macroeconomic context during periods of distress limits the predictive power of firm-level ratios.
Collectively, these findings suggest that the relative importance of financial distress predictors changes across economic regimes. By integrating the emerging theory of regime-dependency with the empirical evidence of variable-weight fluctuations across economic cycles, the following hypothesis is proposed:
Hypothesis (H2): 
The prioritization of factors affecting financial health is regime dependent. 

2.3. The Interaction Effects of Market Sentiment and Structural Leverage

The interaction between market sentiment and corporate leverage is grounded in the Financial Accelerator Principle (Bernanke, Gertler, & Gilchrist, 1999). This theory suggests that adverse shocks heighten financial constraints through balance-sheet effects. Similarly, agency theory (Jensen & Meckling, 1976) and the debt-overhang literature emphasize that the riskiness of leverage depends on market conditions. Thus, leverage becomes more harmful during periods of uncertainty.
The Pecking Order Theory (Myers & Majluf, 1984) and Market Timing (Baker & Wurgler, 2002) further argue that firms adjust financing decisions in response to market sentiment. The interaction between a firm’s internal capital structure and external market conditions creates a feedback loop, whereby financing choices become sensitive to investor expectations. Consequently, during periods of high uncertainty, debt overhang becomes a more severe catalyst for corporate failure (Blickle & Santos, 2024). In this context, market-wide volatility may increase the adverse effects of leverage on firm stability and increase financial distress risk (Geanakoplos, 2024; Pradhan et al., 2025).
Empirical studies support the role of sentiment as a bridge between firm behavior and the macroeconomy. Chen et al. (2021) show that market sentiment bidirectionally connects interest rates and firm investments, suggesting that psychological factors are intrinsic to financial health. Kalantonis et al. (2021) find that negative sentiment, especially during prolonged downturns, reduces leverage capacity. Kalash (2023) provides direct evidence that the negative impact of financial leverage on performance is significantly exacerbated by distress risk and currency crises, highlighting a clear moderating effect of external shocks on internal structural variables.
The intensification of risk via volatility is a common thread in recent empirical research. Caporale et al. (2024) and Karanasos et al. (2022) find that global credit conditions and VIX significantly contribute to market volatility, especially when driven by infectious disease outbreaks and/or economic policy uncertainty. This volatility has a direct link to corporate credit capacity. To illustrate this point, Baum et al. (2010) find that variations in leverage are subject to moderation by macroeconomic uncertainty and corporate governance, suggesting that uncertainty impacts how corporate leverage varies. Another example is provided by Homapour et al. (2022), who find that leverage is counter-cyclical and highly responsive to financial market risk.
Predictive studies also indicate that macro-level variables may have more predictive validity than firm-level ratios during uncertain times. Acosta-González et al. (2019) show that macro-level variables, such as credit availability and market volatility, have more predictive validity than internal financial ratios for corporate failure. This relationship is further illustrated by Nowicki et al. (2024), who show that a firm’s debt level moderates the relationship between macro-level variables, such as GDP growth and inflation, and liquidity. From a theoretical standpoint, Cacciari (2005) argues that globalization dissolves traditional boundaries while creating new systemic pressures; therefore, a firm’s internal leverage is constantly being influenced by external market volatility.
As Kalantonis et al. (2021) note, when sentiment turns negative, especially during protracted downturns, economic sentiment indicators exert downward pressure on leverage strategies. This supports the proposition that external sentiment conditions influence the effectiveness of internal debt decisions, which motivates Hypothesis 3.
Hypothesis (H3): 
Market volatility (VIX) interacts with corporate leverage such that high volatility amplifies the adverse effect of debt on financial stability, increasing the probability of financial distress. 

3. Methodology

The analytical framework of this study is established as a progression from structural credit risk estimation to advanced non-linear predictive modelling. The data collection process utilizes a longitudinal dataset of the S&P 100 index constituents over a twenty-five-year horizon, spanning from 2000 to 2025. This extensive period is selected to encompass diverse economic regimes, including the Global Financial Crisis (2008), the COVID-19 pandemic shock (2020), and the subsequent inflationary and geopolitical tensions of 2022–2025. All market and financial data, including stock prices of S&P 100 constituents, debt levels, market capitalization, and macroeconomic indicators, were retrieved from the Yahoo Finance database. To address the dynamic nature of the index, where companies are frequently added or removed, this study constructs an unbalanced panel data structure. This approach is essential for maintaining the authenticity of the financial landscape, as it incorporates the inherent asymmetry of corporate reporting and the non-synchronicity of market entries and exits across different time frames. The explanation of variables is shown in Table 1.
The procedure consists of three phases, explained as follows:
  • Phase I: Structural Estimation of Financial Health 
The first stage calculates the Distance to Default (DD) using the Merton (1974) structural model. Since the market value and volatility of a firm’s assets are unobservable, they are estimated by solving a system of non-linear equations using an iterative solver. In the first equation, equity is treated as a call option on the firm’s assets:
$$E = V_A \Phi(d_1) - e^{-rT} D \Phi(d_2)$$
where $E$ is the market value of equity, $V_A$ is the unobservable market value of assets, $D$ is the face value of debt, $r$ is the risk-free rate, $T$ is the time to maturity and $\Phi$ is the cumulative standard normal distribution.
The second equation links equity volatility and asset volatility:
$$\sigma_E = \frac{V_A}{E} \Phi(d_1) \sigma_A$$
where $\sigma_E$ is the observed daily volatility of equity derived from 252-day rolling returns and $\sigma_A$ is the unobservable asset volatility. The term $d_1$, representing the normalized distance between asset value and discounted debt, is given by:
$$d_1 = \frac{\ln(V_A / D) + (r + 0.5 \sigma_A^2) T}{\sigma_A \sqrt{T}}$$
The final target variable, $DD$, representing the number of standard deviations the firm’s asset value lies above the default point, is defined as:
$$DD = d_1 - \sigma_A \sqrt{T}$$
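The two-equation system above can be illustrated with a minimal sketch. The paper states only that an iterative solver is used, so the fixed-point scheme below is an assumption, as are the function names (`solve_merton`, `equity_from_assets`), the starting guess, and the tolerance; it also uses the standard identity $d_2 = d_1 - \sigma_A \sqrt{T}$, so the returned $DD$ coincides with $d_2$.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_d2(v_a: float, sigma_a: float, debt: float, r: float, t: float):
    """Return (d1, d2), where d2 = d1 - sigma_A * sqrt(T)."""
    d1 = (math.log(v_a / debt) + (r + 0.5 * sigma_a ** 2) * t) / (sigma_a * math.sqrt(t))
    return d1, d1 - sigma_a * math.sqrt(t)


def equity_from_assets(v_a: float, sigma_a: float, debt: float, r: float, t: float):
    """Forward map: equity value and equity volatility implied by (V_A, sigma_A)."""
    d1, d2 = d1_d2(v_a, sigma_a, debt, r, t)
    equity = v_a * norm_cdf(d1) - math.exp(-r * t) * debt * norm_cdf(d2)
    sigma_e = v_a / equity * norm_cdf(d1) * sigma_a
    return equity, sigma_e


def solve_merton(equity, sigma_e, debt, r, t=1.0, tol=1e-12, max_iter=1000):
    """Recover (V_A, sigma_A, DD) from observed equity value and volatility.

    Hypothetical fixed-point scheme (not necessarily the authors' solver):
    each pass rearranges the two equations for V_A and sigma_A in turn.
    """
    v_a = equity + debt                    # starting guess: assets = equity + debt
    sigma_a = sigma_e * equity / v_a       # starting guess: de-levered equity vol
    for _ in range(max_iter):
        d1, d2 = d1_d2(v_a, sigma_a, debt, r, t)
        # Equity equation solved for V_A
        v_new = (equity + math.exp(-r * t) * debt * norm_cdf(d2)) / norm_cdf(d1)
        # Volatility equation solved for sigma_A
        s_new = sigma_e * equity / (v_new * norm_cdf(d1))
        converged = abs(v_new - v_a) < tol * v_a and abs(s_new - sigma_a) < tol
        v_a, sigma_a = v_new, s_new
        if converged:
            break
    d1, _ = d1_d2(v_a, sigma_a, debt, r, t)
    dd = d1 - sigma_a * math.sqrt(t)       # distance to default as defined above
    return v_a, sigma_a, dd
```

A round trip is a convenient sanity check: generate $E$ and $\sigma_E$ from known asset values with `equity_from_assets`, then confirm that `solve_merton` recovers them. A production solver would more likely use a root-finding routine with explicit failure handling.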
  • Phase II: Predictive Modelling via Ensemble Learning 
The second stage employs two ensemble learning methods, Random Forest (RF) and eXtreme Gradient Boosting (XGBoost), to model the non-linear relationships within the panel dataset.
The Random Forest model operates by training multiple independent decision trees through bagging. The final prediction for $DD$ is the average of the outputs from all $K$ trees (Breiman, 2001):
$$\hat{y} = \frac{1}{K} \sum_{k=1}^{K} f_k(x)$$
where $\hat{y}$ is the predicted Distance to Default, $K$ is the number of decision trees (set to 100 in this study), $x$ is the input feature vector (Debt, VIX, etc.), and $f_k(x)$ is the output of the $k$-th tree.
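The averaging step can be sketched in a few lines. This is a simplified illustration rather than the study's implementation: each "tree" is a one-split stump on a single feature, there is no random feature subsampling, and the function names are hypothetical. Only the bootstrap resampling and the $\frac{1}{K}\sum_k f_k(x)$ average correspond to the formula above.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump by least squares; returns (threshold, left_mean, right_mean)."""
    mean_all = sum(ys) / len(ys)
    best_sse, best = float("inf"), (None, mean_all, mean_all)
    # Excluding the largest value guarantees a non-empty right branch.
    for thr in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best_sse:
            best_sse, best = sse, (thr, ml, mr)
    return best


def predict_stump(stump, x):
    thr, ml, mr = stump
    if thr is None:            # bootstrap sample had a single distinct x value
        return ml
    return ml if x <= thr else mr


def fit_forest(xs, ys, n_trees=100, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Average of the K individual tree outputs."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)


# Made-up toy data: higher debt (x) goes with lower distance to default (y).
debt = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dd = [12.0, 12.0, 11.0, 11.0, 6.0, 5.0, 5.0, 4.0]
forest = fit_forest(debt, dd, n_trees=100)
```

With this toy sample, `predict_forest(forest, 2.0)` is larger than `predict_forest(forest, 7.0)`, matching the negative debt-DD relationship built into the data.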
XGBoost utilizes a sequential boosting process where each new tree corrects the errors of the previous ones. The prediction at step t is the sum of the previous prediction and the output of the new tree (Chen & Guestrin, 2016):
$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$$
where $\hat{y}_i^{(t)}$ is the prediction for observation $i$ at iteration $t$, and $f_t(x_i)$ is the new learner added to the ensemble. To optimize performance, XGBoost minimizes an objective function using a second-order Taylor expansion to approximate the loss function (Chen & Guestrin, 2016):
$$Obj^{(t)} \approx \sum_{i=1}^{n} \left[ l\left(y_i, \hat{y}_i^{(t-1)}\right) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \right] + \Omega(f_t)$$
where $g_i$ and $h_i$ are the first and second-order gradients of the loss function, and $\Omega(f_t)$ is the regularization term to control model complexity.
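To make the second-order update concrete, the sketch below runs the boosting recursion with one-split stumps. It is an illustration under stated assumptions, not the XGBoost library or the study's configuration: the loss is taken to be $l = \frac{1}{2}(y - \hat{y})^2$ (so $g_i = \hat{y}_i - y_i$ and $h_i = 1$), the regularizer is reduced to the L2 leaf penalty $\frac{1}{2}\lambda \sum_j w_j^2$ from Chen & Guestrin (2016), and no learning rate is applied. Under these assumptions the optimal leaf weight is $w^* = -G/(H + \lambda)$, where $G$ and $H$ are the sums of $g_i$ and $h_i$ in the leaf.

```python
def leaf_weight(g_sum, h_sum, lam):
    """Minimizer of G*w + 0.5*(H + lam)*w^2, i.e. w* = -G / (H + lam)."""
    return -g_sum / (h_sum + lam)


def fit_newton_stump(xs, g, h, lam=1.0):
    """Pick the split with the largest second-order gain; fall back to a single leaf."""
    G, H = sum(g), sum(h)
    root = leaf_weight(G, H, lam)
    best_gain, best = 0.0, (None, root, root)
    for thr in sorted(set(xs))[:-1]:
        gl = sum(gi for x, gi in zip(xs, g) if x <= thr)
        hl = sum(hi for x, hi in zip(xs, h) if x <= thr)
        gr, hr = G - gl, H - hl
        gain = 0.5 * (gl ** 2 / (hl + lam) + gr ** 2 / (hr + lam) - G ** 2 / (H + lam))
        if gain > best_gain:
            best_gain = gain
            best = (thr, leaf_weight(gl, hl, lam), leaf_weight(gr, hr, lam))
    return best


def predict_tree(tree, x):
    thr, wl, wr = tree
    if thr is None:
        return wl
    return wl if x <= thr else wr


def boost(xs, ys, n_rounds=50, lam=1.0):
    """Additive update: y_hat(t) = y_hat(t-1) + f_t(x)."""
    pred = [0.0] * len(ys)
    trees = []
    for _ in range(n_rounds):
        g = [p - y for p, y in zip(pred, ys)]   # first-order gradient of 0.5*(y - y_hat)^2
        h = [1.0] * len(ys)                     # second-order gradient of the same loss
        tree = fit_newton_stump(xs, g, h, lam)
        trees.append(tree)
        pred = [p + predict_tree(tree, x) for p, x in zip(pred, xs)]
    return trees, pred


def mse(ys, pred):
    return sum((y - p) ** 2 for y, p in zip(ys, pred)) / len(ys)


# Made-up toy data (debt level vs. distance to default).
debt = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dd = [12.0, 12.0, 11.0, 11.0, 6.0, 5.0, 5.0, 4.0]
trees, fitted = boost(debt, dd, n_rounds=50)
```

Each round fits the new learner to the current gradients, which is what "correcting the errors of the previous trees" means operationally; the penalty $\lambda$ shrinks every leaf weight toward zero.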
  • Phase III: Diagnostic Evaluation and Explainable AI (XAI) 
Statistical validity is ensured through several diagnostic tests. Model performance is evaluated using $R^2$, MAE, and RMSE. To ensure residual integrity, the Durbin-Watson test is conducted for autocorrelation, while Jarque-Bera and Kolmogorov-Smirnov tests assess the normality of error distribution. Furthermore, a Variance Inflation Factor (VIF) analysis is performed to ensure that multicollinearity among macroeconomic features does not cloud the results.
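For reference, the fit metrics and the Durbin-Watson statistic named above have simple closed forms. The sketch below uses hypothetical function names and the textbook definitions, not the study's code. The Durbin-Watson statistic lies between 0 and 4: values near 2 indicate no first-order autocorrelation, values near 0 indicate strong positive autocorrelation, and values near 4 indicate strong negative autocorrelation.

```python
import math


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den
```

Residuals must be ordered in time (within firm, for panel data) before calling `durbin_watson`, since the statistic depends on the sequence.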
The final stage utilizes the SHAP (SHapley Additive exPlanations) framework (Lundberg & Lee, 2017) to decompose the model’s output across different stress regimes like the 2008 GFC and 2020 COVID shock. The SHAP values are calculated based on the game-theoretic formula:
$$\phi_i = \sum_{S \subseteq \{x_1, \dots, x_p\} \setminus \{x_i\}} \frac{|S|! \, (p - |S| - 1)!}{p!} \left[ f(S \cup \{x_i\}) - f(S) \right]$$
where $\phi_i$ is the contribution of feature $i$, $S$ is a subset of features not including $i$, and $p$ is the total number of features. By applying these formulas to the dataset, the study identifies the non-linear thresholds where market volatility amplifies corporate debt burdens.
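The Shapley formula can be evaluated exactly when the number of features is small. The sketch below is illustrative only: the toy model, its coefficients, and the baseline are invented, and $f(S)$ is approximated by setting features outside $S$ to a fixed baseline value, which is a simplification of the conditional expectations used by SHAP's tree explainer.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every subset S that excludes feature i."""
    p = len(features)
    phi = {}
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi


def toy_dd_model(debt, vix):
    """Hypothetical DD model with a debt-VIX interaction term (coefficients are made up)."""
    return 12.0 - 2.0 * debt - 1.0 * vix - 0.5 * debt * vix


instance = {"debt": 2.0, "vix": 3.0}   # observation being explained
baseline = {"debt": 0.0, "vix": 0.0}   # reference point for "absent" features


def value_fn(subset):
    """f(S): features in S take the instance value, the rest take the baseline value."""
    args = {k: (instance[k] if k in subset else baseline[k]) for k in instance}
    return toy_dd_model(**args)


phi = shapley_values(value_fn, list(instance))
```

In this toy case the interaction term contributes $-0.5 \times 2 \times 3 = -3$ to the prediction, and the Shapley formula splits it equally between debt and VIX. The two attributions sum to the difference between the prediction for the instance and the baseline prediction, which is the additivity property that makes SHAP decompositions comparable across regimes.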

4. Results

The analysis begins by validating the structural credit risk parameters derived from the Merton framework. Table 2 reports diagnostic results based on more than half a million daily observations for the S&P 100 index. The model achieves a 100% convergence rate, indicating the numerical robustness of the iterative solver when reconciling observable equity values with unobservable asset structures. The mean DD of 10.849, paired with a remarkably low average Probability of Default (PD) of 0.27%, indicates that the constituent firms, which represent the upper tier of US blue-chip corporate strength, maintained substantial capital buffers over the 2000–2025 period. This low average PD is consistent with the theoretical argument that, despite being sensitive to market volatility, large-cap firms typically display structural resilience that keeps default probabilities low outside crisis periods.
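The relationship between the mean DD and the mean PD deserves a brief illustration. The text does not state how PD is obtained from DD; the sketch below assumes the standard Merton mapping $PD = \Phi(-DD)$, and the sample it uses is invented. Under that assumption, the PD evaluated at the mean DD is vanishingly small, so an average PD of the order reported must be driven by the minority of observations with low DD, because the average of $\Phi(-DD)$ is not $\Phi$ of the average.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with erfc so the far tails stay accurate."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def pd_from_dd(dd: float) -> float:
    """Assumed mapping from distance to default to default probability."""
    return norm_cdf(-dd)


# Made-up sample: 99 healthy observations and one observation at the default point.
toy_dd = [11.0] * 99 + [0.0]
mean_dd = sum(toy_dd) / len(toy_dd)
mean_pd = sum(pd_from_dd(d) for d in toy_dd) / len(toy_dd)
pd_at_mean = pd_from_dd(mean_dd)
```

Here `mean_pd` is about half a percent while `pd_at_mean` is many orders of magnitude smaller, which shows why a double-digit mean DD and a non-trivial mean PD can coexist in the same sample.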
The correlations in Table 2 provide empirical support for the study’s structural hypotheses. The negative correlation between total debt and $DD$ (-0.2679) supports the leverage hypothesis: as corporate liabilities increase, the distance to the insolvency threshold narrows, elevating vulnerability. Conversely, the positive correlation between market capitalization and $DD$ (0.0796) is consistent with the size-resilience hypothesis (Li & Zhang, 2023), suggesting that larger firms benefit from a more substantial equity cushion that dampens the effect of asset volatility. Although average financial health remains robust, the model’s identification of 165 insolvent observations, specifically during peak crisis windows, demonstrates that even the most capitalized firms are susceptible to tail-risk events when asset volatility exceeds historical norms.
Building upon these diagnostics, Figure 1 tracks the aggregate evolution of financial health, revealing clear patterns of regime-dependent fragility. The longitudinal trend illustrates that the S&P 100’s health is not a static property but a dynamic response to the macroeconomic environment. The steep decline in the mean DD during the 2008 Global Financial Crisis, as illustrated in the time-series plot, represents the most severe erosion of corporate stability within the sample. During this period, the mean DD fell to fewer than six standard deviations above the default point. This pattern aligns with the systemic contagion arguments of Campbell et al. (2008), who contend that, in crisis periods, the correlation between market-wide volatility and idiosyncratic firm health approaches unity.
While the 2020 COVID-19 shock induced a vertical and historically sharp decline in DD, the plot in Figure 1 reveals a rapid rebound facilitated by aggressive monetary policy interventions. However, the trajectory from 2022 to 2025 indicates a more complex and sustained pressure on corporate health. Unlike the V-shaped recovery of 2020, the recent period exhibits a persistent downward bias and increased volatility, likely driven by the inflationary environment and the subsequent rising interest rate cycle. This observation suggests that the high-inflation, high-rate regime of the mid-2020s has fundamentally altered the risk profile of the S&P 100, forcing the distance to default into a lower equilibrium than was seen during the era of quantitative easing. These findings underscore the necessity of ensemble learning methods to untangle the non-linear drivers of this modern financial stress.
The empirical analysis proceeds by evaluating predictive modelling performance using ensemble learning architectures. Table 3 presents a comparative summary of the predictive accuracy and diagnostic integrity of the Random Forest and XGBoost models. The results demonstrate that while both architectures exhibit high explanatory power, the XGBoost model consistently outperforms the Random Forest regressor in capturing the complexities of corporate financial health. Specifically, XGBoost achieved a test-set $R^2$ of 0.918, higher than the 0.865 recorded for the Random Forest. This performance gap is further reflected in the error metrics; XGBoost reduced the RMSE to 1.906 compared to 2.449 in the Random Forest model, representing a 22.16% improvement in forecasting precision.
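As a check on the reported gain, the relative RMSE reduction can be recomputed from the rounded values quoted in the text. The small gap to the reported 22.16% is presumably due to rounding of the published RMSE figures, which is an assumption rather than something the tables confirm:

$$\frac{\mathrm{RMSE}_{RF} - \mathrm{RMSE}_{XGB}}{\mathrm{RMSE}_{RF}} = \frac{2.449 - 1.906}{2.449} = \frac{0.543}{2.449} \approx 0.2217$$

That is, roughly a 22.2% reduction in RMSE relative to the Random Forest benchmark.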
These results provide the primary empirical support for Hypothesis 1 (H1), which posits that non-linear machine learning architectures possess a superior capacity to accurately identify corporate financial stress. The significant reduction in RMSE and the higher variance explained ($R^2$) of XGBoost suggest that the model effectively captures the complex interdependence and volatility-driven risk transitions which traditional linear or simpler bagging models might overlook.
The predictive reliability of the Random Forest model is visualized in Figure 2, which displays the alignment of actual versus predicted Distance to Default (DD) values. The plot shows a strong concentration of data points along the 45-degree identity line, confirming a high degree of fit for this architecture. Further, the financial interpretation of the feature importance ranking in Figure 2 provides definitive support for Hypothesis 2 (H2) regarding factor prioritization. As expected, Total Debt emerged as the dominant predictor with an importance score of approximately 0.70.
This confirms the first part of H2, stating that internal firm-specific factors serve as primary determinants of stress. However, the significant weighting of Inflation (0.14) and VIX (0.07) in the global importance plot validates the Macro-Sensitivity component of H2. This suggests that during the crisis-heavy regimes within the 2000–2025 sample (notably the 2008 GFC, 2020 pandemic, and the 2022–2025 inflationary period), exogenous factors gain substantial predictive weight, directly challenging the notion that corporate default is purely an idiosyncratic event. The Durbin-Watson statistic of 1.998 for XGBoost (and 1.984 for RF) confirms the independence of errors, satisfying the rigorous requirements for high-frequency panel data analysis and reinforcing the validity of the findings for all three hypotheses.
The superiority of the XGBoost model over Random Forest provides further empirical evidence for Hypothesis 3 (H3), concerning the interaction effect between market volatility (VIX) and corporate leverage. Because XGBoost uses a gradient boosting framework in which each successive tree is fitted to the residual errors of the ensemble so far, and can therefore pick up feature interactions, its higher accuracy suggests that it is modeling how the VIX exacerbates the destructive impact of debt. This is consistent with H3’s premise that leverage raises the probability of distress more severely under high market volatility than in periods of stability.
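The logic behind H3 can be illustrated with a toy example in which the attribution to debt changes with the level of volatility. The sketch below is not the fitted XGBoost model: `toy_dd` is a hypothetical function with a multiplicative debt-volatility interaction, the input values are invented, and the Shapley values are computed exactly by enumerating feature orderings against a single assumed baseline, rather than with the tree-based SHAP estimator used in the study.

```python
from itertools import permutations
from math import factorial


# Hypothetical toy model (an assumption for illustration only): leverage
# lowers Distance to Default, and the penalty is amplified by volatility.
def toy_dd(x: dict) -> float:
    return 10.0 - x["debt"] * (1.0 + x["vix"])


def exact_shapley(model, instance: dict, baseline: dict) -> dict:
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features switch from their baseline
    value to their instance value."""
    features = list(instance)
    phi = {name: 0.0 for name in features}
    for order in permutations(features):
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            value = model(current)
            phi[name] += value - previous
            previous = value
    n_orders = factorial(len(features))
    return {name: total / n_orders for name, total in phi.items()}


baseline = {"debt": 1.0, "vix": 0.0}  # calm reference point (assumed)
calm = {"debt": 3.0, "vix": 0.0}      # high debt, low volatility
crisis = {"debt": 3.0, "vix": 1.0}    # same debt, high volatility

phi_calm = exact_shapley(toy_dd, calm, baseline)
phi_crisis = exact_shapley(toy_dd, crisis, baseline)
print("calm:  ", phi_calm)
print("crisis:", phi_crisis)
```

With identical debt in both scenarios, the attribution to debt is larger in magnitude in the high-volatility case, because part of the interaction term is credited to debt. This is the kind of pattern that a rise in the SHAP influence of Total Debt during crisis regimes would reflect.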
The diagnostic integrity of the XGBoost model is further examined in the residual analysis presented in Figure 3. The Residuals vs Predicted plot exhibits a balanced distribution around the zero axis, indicating that the residual spread is broadly stable across the majority of the DD range. Additionally, the Histogram of Residuals reveals a leptokurtic distribution centered on zero, consistent with an unbiased model.
To test the robustness of the predictive models and provide a granular test of the research hypotheses, the dataset was partitioned into four distinct economic regimes: the Global Financial Crisis (GFC 2008), the COVID-19 Shock (2020), the Ukraine War/High Inflation Period (2022), and a Baseline Normal Period. Table 4 presents the performance metrics and SHAP-based feature influences for each regime. The model demonstrates strong overall fit, with $R^2$ values consistently above 0.86, indicating that most of the variance in the target variable is explained by the selected predictors. Prediction errors are modest (MAE never exceeds 1.40 and RMSE stays below 2.0), showing reliable forecasting performance across the sample. The residual diagnostics in Table 4 are less clean. The Durbin-Watson values of about 0.03–0.06 lie far below the benchmark of 2 that indicates uncorrelated residuals, which points to positive serial correlation within the regime samples, in contrast to the test-set values near 2 reported in Table 3. The residuals show a slight right skew (skewness 0.14–0.57), kurtosis ranges from 2.74 to 7.52, with the heaviest tails in the full sample and the GFC regime, and the Jarque-Bera test rejects normality in every regime.
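For reference, the Durbin-Watson statistic is the sum of squared first differences of the residuals divided by the sum of squared residuals, which is approximately $2(1 - \rho)$ for first-order autocorrelation $\rho$. Values near 2 therefore indicate uncorrelated residuals and values near 0 indicate strong positive autocorrelation. The sketch below uses synthetic residuals (not the study's data) to show both cases. The seed, sample size and the persistence parameter 0.98 are arbitrary assumptions.

```python
import random


def durbin_watson(residuals: list[float]) -> float:
    """DW = sum of squared first differences / sum of squared residuals."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Synthetic, independent residuals (illustrative data only).
white = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Synthetic AR(1) residuals with strong positive persistence.
rho = 0.98
persistent = [0.0]
for _ in range(4999):
    persistent.append(rho * persistent[-1] + rng.gauss(0.0, 1.0))

dw_white = durbin_watson(white)
dw_persistent = durbin_watson(persistent)
print(f"independent residuals: DW = {dw_white:.3f}")
print(f"AR(1), rho = {rho}:    DW = {dw_persistent:.3f}")
```

The statistic depends on the ordering of the observations, so in a firm-by-day panel it is sensitive to whether residuals are sorted by time within firm or shuffled, which is one possible reason for very different values on the same model.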
The results in Table 4 provide distinct evidence for H2: Total Debt consistently has the highest SHAP influence, while the relative weight of exogenous factors shifts during crises. During the Global Financial Crisis of 2008, the model becomes especially sensitive to market risk. The SHAP influence of the VIX rises to 1.5008, more than double its Normal Period value, and the VIX and the S&P 500 reach their largest values of any regime (≈1.50 and ≈1.73, respectively). Total debt also rises to its peak (≈3.09), highlighting the heightened relevance of leverage and equity volatility in a credit-tight environment. As shown in the SHAP plot for the Global Financial Crisis (2008) in Figure 4, high VIX values are associated with significantly higher positive impacts on the distress model output, in line with the finding of Karanasos et al. (2022) that VIX conditions significantly inflate equity volatility during crises.
In the Ukraine War/High Inflation (2022) regime, Inflation remained a key driver. The corresponding SHAP plot in Figure 5 illustrates how CPI interactions shaped this specialized stress environment, supporting the findings of Nowicki et al. (2024) and Sahin (2018), who identify inflation as a significant moderator of liquidity.
Hypothesis 3, which posits that market volatility exacerbates the destructive impact of debt, is numerically and visually supported across the crisis partitions. In the GFC 2008 regime, Total Debt reached its highest SHAP influence of 3.0888, suggesting that when the VIX is at crisis levels, the model identifies a heightened sensitivity to leverage, consistent with Kalash (2023), who found that financial leverage’s negative effect is exacerbated by distress risk.
In the COVID-19 pandemic period, the 10-year yield reaches its highest influence of any regime (≈0.88), surpassing inflation (≈0.61) and the S&P 500 (≈0.54), although total debt (≈2.14) and the VIX (≈1.00) remain larger. This reflects the pivotal role of monetary policy signalling and yield-curve dynamics when liquidity and fiscal responses dominate market behaviour. The SHAP plot for the COVID period in Figure 6 further highlights this interaction: high debt levels show an elongated tail toward higher SHAP values, indicating extreme risk during systemic lockdowns.
Figure 6. Feature influence in COVID Period.
In the normal, non-crisis interval, inflation attains its highest contribution (≈1.23) and total debt remains substantial (≈2.59), whereas the influence of volatility and the equity market index diminishes (VIX ≈0.69, S&P 500 ≈0.44). The SHAP plot for the Normal/Stable Period in Figure 7 shows a more dispersed influence of macroeconomic shocks, with internal firm metrics dominating the prediction, as expected in stable market conditions.
Figure 7. Feature influence in Normal Period.
These findings suggest that, although total debt is a consistently strong predictor, the relative weight of risk-related variables such as the VIX and equity returns spikes during crises, whereas monetary-policy indicators become more important in pandemic conditions and inflation carries its greatest weight in tranquil periods. Incorporating regime-specific weighting or interaction terms in predictive models could therefore improve accuracy and provide clearer guidance for investors and policymakers navigating varying economic landscapes.
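The shift in relative weights can be made explicit by expressing each feature's SHAP influence as a share of the regime total. The sketch below uses only the values reported in Table 4. Normalizing by the regime total is a presentational choice made here, not a step described in the study, and the short feature labels are abbreviations of the Table 4 column names.

```python
# SHAP influence per feature and regime, copied from Table 4.
FEATURES = ["Total Debt", "10Y Rate", "Inflation", "VIX", "S&P 500"]
TABLE_4 = {
    "GFC 2008":      [3.089, 0.222, 0.881, 1.501, 1.731],
    "COVID 2020":    [2.143, 0.881, 0.612, 1.003, 0.537],
    "Ukraine 2022":  [2.224, 0.203, 0.663, 0.740, 0.586],
    "Normal Period": [2.593, 0.372, 1.235, 0.694, 0.443],
}


def regime_shares(values: list[float]) -> dict[str, float]:
    """Each feature's share of the regime's total SHAP influence."""
    total = sum(values)
    return {name: v / total for name, v in zip(FEATURES, values)}


shares = {regime: regime_shares(vals) for regime, vals in TABLE_4.items()}

# Print one row per regime with shares formatted as percentages.
print(f"{'Regime':<15}" + "".join(f"{name:>12}" for name in FEATURES))
for regime, row in shares.items():
    print(f"{regime:<15}" + "".join(f"{row[name]:>11.1%} " for name in FEATURES))
```

On these shares, Total Debt is the largest contributor in every regime, the VIX share is highest in the GFC sample, the 10-year rate share is highest in the COVID sample, and the inflation share is highest in the Normal Period.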
Further diagnostics are reported in Table 5. The Breusch-Pagan test yielded an F-statistic of 2200.77 ($p < 0.05$), confirming the presence of heteroscedasticity: the residual variance changes with the predictors. In the context of Explainable AI (XAI), this is consistent with the view that the riskiness of a given level of debt is not static but fluctuates with the market fear gauge. Moreover, the skewness of 0.276 and kurtosis of 7.522 indicate a near-symmetric but leptokurtic residual distribution, in which large errors occur more often than under normality. These properties are visible in the Residuals vs. Predicted plot and the Histogram of Residuals in Figure 3, which show residuals centered on zero and therefore no systematic bias.
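The rejection of normality can be checked from the reported moments alone, since the Jarque-Bera statistic is $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$. The sketch below rests on two assumptions that the tables do not state: that the reported kurtosis is on the Pearson scale (a normal distribution has $K = 3$), and that the moments were computed on all 552,713 observations. The p-value uses the asymptotic chi-square distribution with two degrees of freedom.

```python
import math


def jarque_bera(n: int, skewness: float, kurtosis: float) -> float:
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with K on the Pearson scale."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def chi2_sf_2dof(x: float) -> float:
    """Survival function of a chi-square variable with 2 degrees of
    freedom, which has the closed form exp(-x / 2)."""
    return math.exp(-x / 2.0)


# Sample size from Table 4 (full period), moments from Table 5.
n_obs = 552_713
jb = jarque_bera(n_obs, skewness=0.276, kurtosis=7.522)
p_value = chi2_sf_2dof(jb)

print(f"JB statistic: {jb:,.0f}")
print(f"p-value:      {p_value:.3g}")
```

With a sample this large, even small departures from normality produce a very large statistic, so the test result says more about sample size than about the practical size of the deviation. The kurtosis value is the more informative indicator of tail behaviour.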
The regime-specific analysis demonstrates that the XGBoost model is contextually aware, correctly identifying Total Debt as the primary anchor of risk while scaling the influence of VIX and CPI during periods of instability. Compared with Tanaka et al. (2025), who noted distinct bankruptcy mechanisms for COVID-19, our model provides a unified framework that captures these differences through SHAP-based feature importance shifts. This confirms that modern financial stress is a multi-dimensional phenomenon governed by non-linear interaction effects that traditional models fail to quantify.

5. Conclusions

The present research successfully establishes a robust financial early warning system (EWS) for the S&P 100 by moving beyond traditional linear paradigms to capture the complex, non-linear interactions between corporate capital structure and the broader macroeconomic environment. By integrating the XGBoost algorithm with SHAP interpretability, the study overcomes the historical black-box limitations of machine learning in finance, offering a granular perspective on how financial risk drivers evolve across distinct economic regimes.
The high predictive accuracy achieved throughout this analysis addresses several critical limitations identified in the classical financial distress literature. Traditional models, such as the Logit and Discriminant Analysis frameworks employed by Hui & Sun (2006) and Acosta-González et al. (2019), often fail to account for the intricate dependencies between financial leverage and systemic market volatility. Unlike static accounting-based approaches, exemplified by the work of Labosova et al. (2025), which focused exclusively on internal firm ratios, the findings of this study demonstrate that feature importance is inherently regime-dependent. This aligns with the observations of Tanaka et al. (2025) regarding the unique mechanisms of various crises; however, this research provides a more unified framework by showing how macroeconomic factors such as the VIX and interest rates modulate the model’s sensitivity to total debt in real time. Such an interactive approach bridges the gap observed in localized studies by Nowicki et al. (2024) and Ibrahimov et al. (2025), which frequently overlook the significant spillover effects of global market dynamics on firm-level risk.
From a theoretical standpoint, the dominance of Total Debt as the primary anchor of risk across all examined regimes provides substantial empirical support for the Pecking Order Theory. As discussed by Kalantonis et al. (2021) and Homapour et al. (2022), debt constitutes a structural liability during market contractions, but the SHAP analysis herein further quantifies this by revealing that the destructive impact of leverage is significantly magnified during periods of high market turbulence. This reinforces the distress-risk moderation hypothesis suggested by Kalash (2023) and suggests that a firm’s debt-bearing capacity responds asymmetrically to inflationary shocks and interest rate shifts, as noted in the context of emerging markets by Sahin (2018). Consequently, the confirmation of heteroscedasticity within the model necessitates a critical re-evaluation of linear business cycle models, such as those by Jermann & Quadrini (2009), which may underestimate the multi-dimensional nature of financial distress.
These findings translate into several practical implications for corporate and economic policy. The shifting role of risk drivers during crises necessitates that EWS frameworks move beyond static parameters to incorporate real-time market fear gauges and inflationary indices. While Total Debt remains the most consistent predictor, its weight is significantly moderated by macroeconomic conditions, particularly the 10-year interest rate, mirroring the interaction between uncertainty and capital structure identified by Baum et al. (2010). Furthermore, the susceptibility of developed markets to systemic spillovers remains consistent with the fundamental drivers of volatility analyzed by Caporale et al. (2024) and Karanasos et al. (2022). Therefore, it is recommended that corporate treasurers implement dynamic leverage management strategies where debt thresholds are adjusted based on forward-looking market volatility indices. Simultaneously, credit rating agencies and financial institutions should adopt regime-aware risk models that adjust feature weights according to the prevailing economic climate, while developing specialized inflationary hedging instruments for firms identified as highly sensitive to post-2022 economic shifts.
Looking toward the future of the field, there are several paths for expanding this framework. Future research could benefit from the integration of geopolitical risk variables and trade network centrality, as inspired by Zhang et al. (2025), to assess the resilience of global supply chains against localized financial distress. Synthesizing these quantitative metrics with sentiment analysis derived from financial news and managerial reports could further enhance short-term predictive power, building upon the foundational logic of Chen et al. (2021). Additionally, investigating the nexus between Environmental, Social, and Governance (ESG) scores and distress risk across different economic cycles, using optimization frameworks similar to those in Wang et al. (2025), would add a layer of modern sustainability-linked risk assessment. Finally, conducting sector-specific analyses to compare the sensitivity of diverse industries to shifts in economic regimes and interest rate trajectories, as suggested by the work of Ergenç & Aktaş (2025), would provide the necessary nuance for targeted regulatory oversight.
In conclusion, this research demonstrates that in the modern financial landscape, predictive accuracy must be coupled with explainability to navigate the complexities of global economic volatility and corporate risk.

Author Contributions

Conceptualization, Seyed J. Tabatabaei and Mohammad M. Mousavi; methodology, Seyed J. Tabatabaei; data curation, Seyed J. Tabatabaei; writing—original draft preparation, Seyed J. Tabatabaei; writing—review and editing, Mohammad M. Mousavi. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are available upon request.

Conflicts of Interest

One of the authors serves as a Guest Editor for the Special Issue to which this manuscript has been submitted. To avoid any potential conflict of interest, the editorial handling of this paper was delegated to an independent editor, and the author had no involvement in the peer-review process, decision-making, or selection of reviewers for this manuscript. All editorial procedures were conducted in accordance with the journal’s standard policies to ensure an objective and unbiased review process.

References

  1. Acosta-González, E.; Fernández-Rodríguez, F.; Ganga, H. Predicting corporate financial failure using macroeconomic variables and accounting data. Computational Economics 2019, 53(1), 227–257.
  2. Baker, M.; Wurgler, J. Market Timing and Capital Structure. Journal of Financial Economics 2002, 65(2), 205–250.
  3. Baum, C. F.; Chakraborty, A.; Liu, B. The impact of macroeconomic uncertainty on firms’ changes in financial leverage. International Journal of Finance & Economics 2010, 15(1), 22–30.
  4. Breiman, L. Random Forests. Machine Learning 2001, 45(1), 5–32.
  5. Bernanke, B. S.; Gertler, M.; Gilchrist, S. The Financial Accelerator in a Quantitative Business Cycle Framework. In Handbook of Macroeconomics; 1999; Vol. 1, pp. 271–357.
  6. Berninger, M.; Fiesenig, B.; Schiereck, D. The performance of corporate bond issuers in times of financial crisis: empirical evidence from Latin America. The Journal of Risk Finance 2021.
  7. Blickle, K.; Santos, J. A. C. The costs of corporate debt overhang. Journal of Financial Intermediation 2024, 60, 101026.
  8. Black, F.; Scholes, M. The pricing of options and corporate liabilities. In World Scientific Reference on Contingent Claims Analysis in Corporate Finance; 2019; pp. 3–21.
  9. Campbell, J. Y.; Hilscher, J.; Szilagyi, J. In Search of Distress Risk. The Journal of Finance 2008, 63, 2899–2939.
  10. Caporale, G. M.; Karanasos, M.; Yfanti, S. Macro financial linkages in the high frequency domain: Economic fundamentals and the Covid induced uncertainty channel in US and UK financial markets. International Journal of Finance & Economics 2024, 29(2), 1581–1608.
  11. Chen, Z.; Lien, D.; Lin, Y. Sentiment: The bridge between financial markets and macroeconomy. Journal of Economic Behavior & Organization 2021, 188, 1177–1190.
  12. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16), 2016; pp. 785–794.
  13. Chan, C. P.; Tang, F. K.; Yang, J. H. Financial Performance Prediction and Stability Analysis Using SHAP Enhanced Machine Learning Models. Computational Economics 2026, 1–22.
  14. Chan, C. P.; Tsai, C. H.; Tang, F. K.; Yang, J. H. A SHAP Based Comparative Analysis of Machine Learning Model Interpretability in Financial Classification Tasks. Journal of Applied Economic Sciences 2025, 20(3).
  15. Cheong, C.; Hoang, H. V. Macroeconomic factors or firm specific factors? An examination of the impact on corporate profitability before, during and after the global financial crisis. Cogent Economics & Finance 2021, 9(1), 1959703.
  16. Choi, S. B.; Sauka, K.; Lee, M. Dynamic capital structure adjustment: an integrated analysis of firm specific and macroeconomic factors in Korean firms. International Journal of Financial Studies 2024, 12(1), 26.
  17. Diamond, D. W.; Dybvig, P. H. Bank Runs, Deposit Insurance, and Liquidity. Journal of Political Economy 1983, 91(3), 401–419.
  18. Ding, S.; Cui, T.; Bellotti, A. G.; Abedin, M. Z.; Lucey, B. The role of feature importance in predicting corporate financial distress in pre and post COVID periods: Evidence from China. International Review of Financial Analysis 2023, 90, 102851.
  19. Ergenç, C.; Aktaş, R. Sector specific financial forecasting with machine learning algorithm and SHAP interaction values. Financial Internet Quarterly 2025, 21(1), 42–66.
  20. European Systemic Risk Board (ESRB). Systemic liquidity risk: a monitoring framework; 2025.
  21. Geanakoplos, J. Leverage Cycle Theory of Economic Crises and Booms. Oxford Research Encyclopedia of Economics and Finance 2024.
  22. Huang, Y. P.; Yen, M. F. A new perspective of performance comparison among machine learning algorithms for financial distress prediction. Applied Soft Computing 2019, 83, 105663.
  23. Homapour, E.; Su, L.; Caraffini, F.; Chiclana, F. Regression analysis of macroeconomic conditions and capital structures of publicly listed British firms. Mathematics 2022, 10(7), 1119.
  24. Hui, X. F.; Sun, J. An application of support vector machine to companies’ financial distress prediction. In International Conference on Modeling Decisions for Artificial Intelligence; Springer, 2006; pp. 274–282.
  25. Ibrahimov, O.; Vancsura, L.; Parádi Dolgos, A. The impact of macroeconomic factors on the firm’s performance—Empirical analysis from Türkiye. Economies 2025, 13(4), 111.
  26. Jensen, M. C.; Meckling, W. H. Theory of the Firm: Managerial Behavior, Agency Costs, and Capital Structure. Journal of Financial Economics 1976, 3(4), 305–360.
  27. Kalantonis, P.; Kallandranis, C.; Sotiropoulos, M. Leverage and firm performance: new evidence on the role of economic sentiment using accounting information. Journal of Capital Markets Studies 2021, 5(1), 96–107.
  28. Kalash, I. The financial leverage–financial performance relationship in the emerging market of Turkey: the role of financial distress risk and currency crisis. Eurasian Journal of Business and Economics 2023, 11(22), 59–81.
  29. Karanasos, M.; Yfanti, S.; Hunter, J. Emerging stock market volatility and economic fundamentals: The importance of US uncertainty spillovers, financial and health crises. Annals of Operations Research 2022, 313(2), 1077–1116.
  30. Kirilenko, A.; Kyle, A. S.; Samadi, M.; Tuzun, T. The Flash Crash: High-frequency trading in an electronic market. The Journal of Finance 2017, 72(3), 967–998.
  31. Khandani, A. E.; Kim, A. J.; Lo, A. W. Machine learning for credit risk: Recent advances and challenges. Annual Review of Financial Economics 2024, 16, 123–150.
  32. Kiyotaki, N.; Moore, J. Credit Cycles. Journal of Political Economy 1997, 105(6), 2113–2148.
  33. Labosova, V.; Duricova, L.; Durana, P. One Model Fits All? Evaluating Bankruptcy Prediction Across Different Economic Periods. Economies 2025, 13(12), 361.
  34. Lee, C. C.; Lee, C. C.; Ning, S. L. Dynamic relationship of oil price shocks and country risks. Energy Economics 2017, 66, 571–581.
  35. Li, C.; Zhang, Y. The role of firm size in corporate debt distress during COVID-19. Finance Research Letters 2023, 53, 103577.
  36. Liu, Y. Design of XGBoost prediction model for financial operation fraud of listed companies. International Journal of System Assurance Engineering and Management 2023, 14(6), 2354–2364.
  37. Liu, T.; Zhao, W. Climate Shocks, Stock Price Crash Risk, and Corporate Sustainability: Evidence from China’s Financial System. Systems 2026, 14(1), 18.
  38. Lundberg, S. M.; Lee, S. I. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017; pp. 4765–4774.
  39. Mienye, E.; Jere, N.; Obaido, G.; Mienye, I. D.; Aruleba, K. Deep Learning in Finance: A Survey of Applications and Techniques. AI 2024.
  40. Myers, S. C.; Majluf, N. S. Corporate Financing and Investment Decisions When Firms Have Information That Investors Do Not. Journal of Financial Economics 1984, 13(3), 187–221.
  41. Minsky, H. P. Stabilizing an Unstable Economy; Yale University Press: New Haven, CT, 1986.
  42. Merton, R. C. On the Pricing of Corporate Debt: The Risk Structure of Interest Rates. The Journal of Finance 1974, 29(2), 449–470.
  43. Nowicki, J.; Ratajczak, P.; Szutowski, D. Impact of macroeconomic factors on financial liquidity of companies: a moderation analysis. Sustainability 2024, 16(11), 4483.
  44. Pradhan, S.-K.; Stebunovs, V.; Takáts, E.; Temesvary, J. Geopolitics Meets Monetary Policy: Decoding Their Impact on Cross-Border Bank Lending. In International Finance Discussion Papers 1403; Board of Governors of the Federal Reserve System, 2025.
  45. Qi, R. Enterprise financial distress prediction based on machine learning and SHAP interpretability analysis. In Proceedings of the 2025 International Conference on Artificial Intelligence and Digital Finance; 2025; pp. 76–79.
  46. Sahin, O. Firm specific and macroeconomic determinants of capital structure: evidence from fragile five countries. Eurasian Journal of Business and Economics 2018, 11(22), 59–81.
  47. Tanaka, K.; Higashide, T.; Kinkyo, T.; Hamori, S. Financial Mechanisms of Corporate Bankruptcy: Are They Different or Similar Across Crises? Risks 2025, 13(8), 158.
  48. Tiwari, R. K. D.; Bansal, N. P.; Arvind, L. Graph Neural Networks in Complex Data Pattern Recognition. Zenodo (CERN European Organization for Nuclear Research) 2025.
  49. Tran, K. L.; Le, H. A.; Nguyen, T. H.; Nguyen, D. T. Explainable machine learning for financial distress prediction: evidence from Vietnam. Data 2022, 7(11), 160.
  50. Wang, Y.; Wei, W.; Liu, Z.; Liu, J.; Lv, Y.; Li, X. Interpretable machine learning framework for corporate financialization prediction: A SHAP based analysis of high dimensional data. Mathematics 2025, 13(15), 2526.
  51. Wang, X.; Jin, Z. Multi-region infectious disease prediction modeling based on spatio-temporal graph neural network and the dynamic model. PLoS Computational Biology 2025.
  52. Yan, D.; Chi, G.; Lai, K. K. Financial distress prediction and feature selection in multiple periods by lassoing unconstrained distributed lag non-linear models. Mathematics 2020, 8(8), 1275.
  53. Yang, L.-W.; Binh, N. T. T.; Yi, J. M. Advanced Techniques for Financial Distress Prediction. Forecasting 2026, 8(1), 2.
  54. Zhang, Y.; Sánchez Arnau, E.; Sánchez Pérez, E. A. Impact of geopolitical and international trade dynamics on corporate vulnerability and insolvency risk: A graph-based approach. Information 2025, 16(7), 525.
Figure 1. Longitudinal Evolution of S&P 100 Aggregate Financial Health (Mean DD) across Economic Regimes (2000–2025).
Figure 2. Random Forest Diagnostic Analysis: Actual vs. Predicted DD and Global Feature Importance.
Figure 3. XGBoost Statistical Diagnostics: Residual Distribution and Error Independence Analysis.
Figure 4. Feature influence in global financial crisis.
Figure 5. Feature influence in Ukraine war.
Table 1. Variable Definitions and Data Sources.

| Variable Category | Variable Name | Proxy / Calculation Method |
|---|---|---|
| Dependent Variable | Distance to Default | Merton Structural Outcome |
| Firm-Specific | Market Value of Equity | Price × Shares Outstanding |
| Firm-Specific | Total Debt | Short-term + Long-term Debt |
| Market Signal | S&P 500 Index | Daily Closing Level |
| Volatility | Market Fear Gauge | CBOE Volatility Index |
| Macroeconomic | Inflation Rate | Consumer Price Index YoY % |
| Monetary | Risk-Free Rate | 10-Year Treasury Yield |

Table 2. Diagnostic Metrics and Hypothesis Validation for Structural Credit Risk.

| Diagnostic Metric | Observed Value | Hypothesis Status | Financial Interpretation |
|---|---|---|---|
| Total Observations | 552,713 | - | High-frequency longitudinal panel |
| Convergence Rate (%) | 100% | Verified | Numerical stability of the solver |
| Mean DD | 10.8494 | Robust | High safety margin of S&P 100 |
| Mean PD | 0.0027 | Low | Minimal idiosyncratic default risk |
| Debt vs DD Correlation | -0.2679 | Confirmed | Leverage erodes financial health |
| Market Cap vs DD Correlation | 0.0796 | Confirmed | Scale serves as a protective buffer |
| Insolvent Observations (DD < 0) | 165 | Rare | Detection of extreme tail-risk events |

Table 3. Comparative Performance and Statistical Diagnostics of Ensemble Models.

| Evaluation Metric | Random Forest (Test) | XGBoost (Test) | Overfitting Gap (%) | Statistical Interpretation |
|---|---|---|---|---|
| R-squared ($R^2$) | 0.8653 | 0.9184 | 0.30% | Variance explanation of DD |
| Mean Absolute Error (MAE) | 1.6284 | 1.2229 | 1.43% | Point-estimation accuracy |
| Root Mean Squared Error (RMSE) | 2.4489 | 1.9061 | 0.82% | Robustness to outliers |
| Durbin-Watson (DW) | 1.9845 | 1.9982 | - | Absence of autocorrelation |
| Mean of Residuals | 0.0031 | 0.0020 | - | Absence of systematic bias |

Table 4. Regime-Specific Model Performance and SHAP Feature Influence Analysis.

| Period | Observations | $R^2$ | MAE | RMSE | Durbin-Watson | Skewness | Kurtosis | JB (p-value) | SHAP Total Debt Numeric | SHAP 10Y Interest Rate | SHAP Inflation CPI | SHAP VIX | SHAP S&P 500 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Full Period | 552713 | 0.923 | 1.188 | 1.854 | 0.032 | 0.277 | 7.523 | 0.000 | 2.785 | 0.374 | 1.105 | 0.878 | 0.815 |
| GFC 2008 | 32846 | 0.908 | 1.146 | 1.850 | 0.041 | 0.315 | 6.392 | 0.000 | 3.089 | 0.222 | 0.881 | 1.501 | 1.731 |
| COVID 2020 | 48406 | 0.916 | 0.850 | 1.232 | 0.060 | 0.568 | 3.629 | 0.000 | 2.143 | 0.881 | 0.612 | 1.003 | 0.537 |
| Ukraine 2022 | 45105 | 0.868 | 0.869 | 1.256 | 0.042 | 0.216 | 2.744 | 0.000 | 2.224 | 0.203 | 0.663 | 0.740 | 0.586 |
| Normal Period | 70898 | 0.920 | 1.398 | 1.990 | 0.032 | 0.137 | 3.033 | 0.000 | 2.593 | 0.372 | 1.235 | 0.694 | 0.443 |

Table 5. Feature Metrics and Model Robustness.

| Feature / Metric | VIF / Mean Value | Std. Dev / Interpretation |
|---|---|---|
| Total_Debt_Numeric | 1.258 | Low Multicollinearity |
| Interest_Rate_10Y | 0.819 | Low Multicollinearity |
| Inflation_CPI | 0.769 | Low Multicollinearity |
| VIX | 1.015 | Low Multicollinearity |
| S&P 500 | 2.475 | Moderate Multicollinearity |
| K-Fold Mean $R^2$ | 0.9190 | 0.0021 (High Stability) |
| Residual Skewness | 0.276 | Near-Symmetric Distribution |
| Residual Kurtosis | 7.522 | Leptokurtic (Robustness) |
| BP Test F-Statistic | 2200.77 | p = 0.000 (Heteroscedasticity) |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2026 MDPI (Basel, Switzerland) unless otherwise stated