Submitted:
29 December 2025
Posted:
31 December 2025
Abstract
Keywords:
1. Introduction
2. Related Studies
2.1. The Concept of Network Modelling for ESG Risk
2.2. Theory Approach
2.2.1. Stakeholder Theory
2.2.2. Regulatory Spillover Theory
2.3. Hypothesis Formulation
2.4. Research Gap
3. Research Design
3.1. Multi-Method Research Approach
3.2. Data Volume
3.3. Sample Size Selection Procedure
3.4. Network-Based ESG Dataset Structure
3.5. Measurement of Variables
3.6. A Multilayer Network Econometric and AI Framework for ESG Modelling
1. Network-Based Baseline Model
2. Supervised and Graph-Based Learning
3. Unsupervised Learning and Anomaly Detection
4. Results
4.1. Descriptive Analysis
4.2. Test of Hypotheses


5. Discussion and Implications
5.1. Discussion
5.2. Implications for Sustainability Accounting and Regulation
6. Conclusions
Appendix A
| Rank | Country | Share of Events |
| 1 | United States | 26.4% |
| 2 | United Kingdom | 4.7% |
| 3 | Russia | 3.9% |
| 4 | India | 3.5% |
| 5 | China | 3.0% |
| 6 | Israel | 2.5% |
| 7 | Nigeria | 2.0% |
| 8 | France | 2.0% |
| 9 | Canada | 1.9% |
| 10 | Australia | 1.8% |
| Total | | 51.7% |
| Rank | Media Source | Country | Share of Events |
| 1 | MSN | United States | 3.1% |
| 2 | Reuters | United Kingdom | 0.9% |
| 3 | Love Radio | United States | 0.9% |
| 4 | Daily Mail | United Kingdom | 0.5% |
| 5 | Yahoo | United States | 0.4% |
| 6 | The Times of India | India | 0.4% |
| 7 | Pan-African Network | South Africa | 0.4% |
| 8 | Houston Chronicle | United States | 0.3% |
| 9 | Washington Times | United States | 0.3% |
| 10 | San Francisco Chronicle | United States | 0.3% |
| Total | | | 7.9% |
| Industry / Region | Asia-Pacific | Europe | North America | Latin America | Africa | Middle East | Total |
| Manufacturing | 3,030 | 1,700 | 1,450 | 500 | 250 | 750 | 7,680 |
| Tech & Electronics | 1,770 | 950 | 820 | 190 | 130 | 380 | 4,240 |
| Transport & Logistics | 630 | 560 | 500 | 190 | 130 | 250 | 2,260 |
| Agriculture & Commodities | 540 | 320 | 290 | 350 | 290 | 260 | 2,050 |
| Energy & Extractives | 440 | 220 | 190 | 220 | 220 | 220 | 1,510 |
| Retail & Consumer Goods | 630 | 500 | 560 | 220 | 160 | 130 | 2,200 |
| Pharma & Chemicals | 440 | 380 | 380 | 130 | 60 | 130 | 1,520 |
| Financial & Business Services | 250 | 440 | 280 | 100 | 30 | 160 | 1,260 |
| Construction & Engineering | 220 | 220 | 160 | 60 | 30 | 190 | 880 |
| Automotive & Mobility | 440 | 630 | 530 | 130 | 30 | 130 | 1,890 |
| Total | 8,390 | 5,220 | 5,160 | 2,190 | 1,330 | 2,600 | 25,500 |
| Variable | Symbol | Observations | Mean | Std. Dev. | Min | Max |
| Environmental Risk | | 25,000 | 2.84 | 1.21 | 0.00 | 9.40 |
| Social Risk | | 25,000 | 3.12 | 1.44 | 0.00 | 10.20 |
| Governance Risk | | 25,000 | 2.57 | 1.18 | 0.00 | 8.30 |
| Shipment Volume (log) | | 2,500,000 | 7.89 | 1.96 | 0.00 | 15.21 |
| Trade Dependency | | 25,000 | 0.41 | 0.22 | 0.05 | 0.98 |
| Negative News Sentiment | | 1,200,000 | –5.87 | 12.44 | –90.00 | 85.00 |
| Network Position | | 25,000 | 0.28 | 0.15 | 0.01 | 0.89 |
| ESG Incident Count (Lagged) | | 25,000 | 1.72 | 3.64 | 0.00 | 47.00 |
| Regional Context | | 25,000 | 0.56 | 0.18 | 0.13 | 0.91 |
| Test | Statistic | p-value |
| Moran’s I (ESG risk) | 0.218 | 0.000 |
| LM-lag | 31.44 | 0.000 |
| Robust LM-lag | 24.37 | 0.000 |
| LM-error | 18.22 | 0.000 |
| Robust LM-error | 7.11 | 0.008 |
| Wald Test (ρ ≠ 0) | 12.58 | 0.000 |
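The Moran's I statistic reported in the diagnostics table can be illustrated with a minimal sketch. It assumes the standard global Moran's I formula, I = (n / S0) · Σ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)², applied to a hypothetical four-firm chain; the weight matrix `W` and the `risk` values are invented for illustration and are not taken from the study's data.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a weight matrix w_ij (list of lists)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


# Hypothetical chain of four firms: 0 - 1 - 2 - 3 (symmetric binary ties).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
risk = [1.0, 2.0, 3.0, 4.0]  # illustrative ESG risk scores

print(round(morans_i(risk, W), 3))  # positive: linked firms have similar risk
```

A positive value indicates that connected firms tend to carry similar risk scores, which is the pattern the reported statistic (0.218) points to; an alternating pattern along the same chain would give a negative value.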
| Regressor | SLM Coef. (SE) | SDM Coef. (SE) | Panel FE Model Coef. (SE) |
| Network Effects | |||
| Network-lag ESG risk | 0.314* (0.098) | 0.271 (0.104) | 0.284* (0.072) |
| Spatial lag of Shipment Volume | 0.083 (0.034) | ||
| Spatial lag of Eigenvector | 0.067 (0.031) | ||
| Spatial lag of Regional Context | –0.212* (0.081) | ||
| Centrality Measures | |||
| Eigenvector centrality | 0.141* (0.053) | 0.128 (0.051) | 0.097 (0.038) |
| Firm-Level ESG Predictors | |||
| Negative Sentiment | 0.021* (0.006) | ||
| ESG Incident Count | 0.063* (0.014) | ||
| Shipment Volume (log) | 0.112* (0.025) | 0.097* (0.026) | 0.052 (0.019) |
| Trade Dependency | 0.129 (0.191) | ||
| Contextual Controls | |||
| Regional Context | –0.764* (0.203) | –0.689* (0.214) | |
| Model Features & Fit | |||
| Firm FE | Included | ||
| Year FE | Included | ||
| Spatial parameter ρ | 0.233* | 0.217* | |
| Pseudo-R² / Within R² | 0.21 | 0.27 | 0.23 |
| Observations | 25,000 firms | 25,000 firms | 200,000 firm-years (25,000 × 8 years) |
| Regressor | Environmental Risk Std. β (AME) | Social Risk Std. β (AME) | Governance Risk Std. β (AME) |
| Dependent Variables: ESG Risk | |||
| Degree centrality | 0.241* (0.052) | 0.389* (0.071) | 0.118 (0.031) |
| Betweenness centrality | 0.524* (0.118) | 0.312* (0.059) | 0.388* (0.082) |
| Eigenvector centrality | 0.417* (0.094) | 0.518* (0.102) | 0.289* (0.066) |
| Clustering coefficient | 0.233* (0.047) | 0.095* (0.018) | 0.315* (0.056) |
| Constant | 0.025 | 0.018 | 0.021 |
| Model Fit | |||
| R² | 0.578 | 0.567 | 0.522 |
| Observations | 200,000 | 200,000 | 200,000 |
| Model | ΔAUC | DeLong Z | DeLong p-value | McNemar χ² | McNemar p-value | 95% CI for ΔAUC |
| Logistic Regression | 0.183 | 6.12 | <0.001 | 21.8 | <0.001 | [0.142, 0.214] |
| Random Forest | 0.218 | 7.55 | <0.001 | 34.1 | <0.001 | [0.181, 0.253] |
| Gradient Boosting | 0.164 | 5.01 | <0.001 | 18.6 | <0.001 | [0.129, 0.196] |
| XGBoost | 0.184 | 6.44 | <0.001 | 26.4 | <0.001 | [0.144, 0.223] |
| Robustness Dimension | Metric | FinBERT | RoBERTa | mBERT |
| A. Cross-Model Alignment | Pearson correlation (vs. FinBERT) | 1.00 | 0.84 | 0.72 |
| | Spearman rank correlation (vs. FinBERT) | 1.00 | 0.81 | 0.69 |
| B. Inferential Stability (Logit) | Lagged sentiment coefficient (β) | −0.87*** | −0.76*** | −0.55** |
| | Odds ratio | 0.42 | 0.47 | 0.58 |
| | 95% CI for odds ratio | [0.38, 0.46] | [0.43, 0.52] | [0.41, 0.82] |
| C. Predictive Contribution (Random Forest) | AUC, controls only | 0.51 | 0.51 | 0.51 |
| | AUC, controls + sentiment | 0.73 | 0.71 | 0.66 |
| | ΔAUC | +0.22 | +0.20 | +0.15 |
| | 95% CI for ΔAUC | [0.18, 0.26] | [0.16, 0.24] | [0.10, 0.20] |
| D. Classification Reliability | Polarity flip rate | 3% | 7% | 10% |
| | Sentiment variance | Low | Moderate | High |
| E. Extreme Shock Sensitivity | Incident odds ratio (bottom-decile sentiment) | 10.3× | 8.9× | 6.1× |
| | 95% CI for shock odds ratio | [8.9, 11.9] | [7.4, 10.7] | [4.8, 7.9] |
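The odds ratios in panel B follow from the logit coefficients through the usual transformation OR = exp(β). The short sketch below reproduces the three reported values from the tabulated coefficients; the function name `odds_ratio` is an illustrative choice, not something defined in the paper.

```python
import math


def odds_ratio(beta):
    """Convert a logit coefficient into an odds ratio."""
    return math.exp(beta)


# Lagged sentiment coefficients as reported in panel B of the table.
betas = {"FinBERT": -0.87, "RoBERTa": -0.76, "mBERT": -0.55}

ratios = {name: round(odds_ratio(b), 2) for name, b in betas.items()}
print(ratios)  # {'FinBERT': 0.42, 'RoBERTa': 0.47, 'mBERT': 0.58}
```

An odds ratio below one means that a one-unit improvement in lagged sentiment is associated with lower odds of an incident, with the weakest association under mBERT.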
References
- Carvalho, V. M., Nirei, M., Saito, Y. U., & Tahbaz-Salehi, A. (2021). Supply chain disruptions. Quarterly Journal of Economics, 136(4), 2051–2121.
- Berg, F., Kölbel, J. F., & Rigobon, R. (2022). Aggregate confusion: The divergence of ESG ratings. Review of Finance, 26(6), 1315–1344.
- Acemoglu, D., Carvalho, V. M., Ozdaglar, A., & Tahbaz-Salehi, A. (2012). The network origins of aggregate fluctuations. Econometrica, 80(5), 1977–2016.
- MacCarthy, B. L., Ahmed, W. A. H., & Demirel, G. (2022). Mapping the supply chain: Why, what, and how? International Journal of Production Economics, 250, 108688.
- Wei, X., Xu, J., Zeng, C., Chen, Y., & colleagues. (2024). Gone with the chain: The ripple effect of ESG performance in China’s industrial chain. Environmental Impact Assessment Review, 108(4), 107576.
- Tan, Z., Liu, S., Liu, Q., Hu, M., Zhang, X., Wang, W., & Liu, B. (2025). Modeling ESG-driven industrial value chain dynamics using directed graph neural networks. Financial Innovation, 11, Article 83.
- Bergier, I. (2025). Relational infrastructures for planetary health: Network governance and inner development in Brazil’s traceable beef export system. Challenges, 16(4), 48.
- Angioni, S., Consoli, S., Dessì, D., & Salatino, A. A. (2024). Exploring environmental, social, and governance (ESG) discourse in news: An AI-powered investigation through knowledge graph analysis. IEEE Access. Advance online publication.
- Brockmann, N., Kosasih, E. E., & Brintrup, A. (2022). Supply chain link prediction on uncertain knowledge graphs. ACM SIGKDD Explorations Newsletter, 24(2), 124–130.
- Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794).
- Freeman, R. E. (1984). Strategic management: A stakeholder approach. Pitman.
- Vogel, D. (1995). Trading up: Consumer and environmental regulation in a global economy. Harvard University Press.
- Bradford, A. (2020). The Brussels effect: How the European Union rules the world. Oxford University Press.
- Khan, M., Serafeim, G., & Yoon, A. (2016). Corporate sustainability: First evidence on materiality. The Accounting Review, 91(6), 1697–1724.
- Eccles, R. G., Ioannou, I., & Serafeim, G. (2014). The impact of corporate sustainability on organizational processes and performance. Management Science, 60(11), 2835–2857.
- Guerrero, S., & Viteri, J. P. (2025). What do environmental, social, and governance scores measure? The role of outcome and impact indicators in ESG scores. Finance Research Letters, 72, 106529.
- Kinnear, D., & Ogden, J. (2021). Network dependence in firm risk. Journal of Economic Dynamics and Control, 125, 104079.
- Jackson, M. O. (2010). Social and economic networks. Princeton University Press.
- Buehler, K., Hunneman, A., & Perakis, S. (2022). ESG analytics and explainable AI. McKinsey Quarterly.
- Buehler, S., & Schädler, K. (2022). Machine learning for ESG risk prediction. Journal of Sustainable Finance & Investment.
- Kim, H., & Lee, J. (2022). Proposing an integrated approach to analyzing ESG data via machine learning and natural language processing. Sustainability, 14, 4456.
- Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 1135–1144).
- Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785–794).
- Lundberg, S. M., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (pp. 4768–4777).
- Magaletti, N., Notarnicola, V., Di Molfetta, M., Mariani, S., & Leogrande, A. (2025). Logistics performance and the three pillars of ESG. Sustainability, 17(24), 11370.
- The GDELT Project. (2025). GDELT 2.0 event database. https://www.gdeltproject.org.
- Leetaru, K., & Schrodt, P. (2013). GDELT: Global data on events, language, and tone, 1979–2012. International Studies Association Annual Conference, San Diego, CA.
- S&P Global. (2023). Panjiva: Global trade and supply-chain intelligence. https://www.spglobal.com/marketintelligence/en/solutions/panjiva.
- Lavin, J. F., & Montecinos-Pearce, A. A. (2021). ESG reporting: Empirical analysis of the influence of board heterogeneity from an emerging market. Sustainability, 13(6), 3090.
- Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. In Proceedings of the 5th International Conference on Learning Representations (ICLR) (pp. 1–14). https://arxiv.org/abs/1609.02907.
- Brockmann, N., Kosasih, E. E., & Brintrup, A. M. (2022). Supply chain link prediction on uncertain knowledge graphs. ACM SIGKDD Explorations Newsletter, 24(2), 124–130.
- McLachlan, G. J., & Peel, D. (2000). Finite mixture models. Wiley.
- Liu, F. T., Ting, K. M., & Zhou, Z.-H. (2012). Isolation-based anomaly detection. ACM Transactions on Knowledge Discovery from Data, 6(1), Article 3.
- Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504–507.
- Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
- Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 NAACL-HLT Conference (pp. 4171–4186).
- Hong, D., Fu, Z., Zhang, X., & Pan, Y. (2025). Research on the development and application of the GDELT event database. Data, 10(10), 158.
- Tian, M., Li, S., Cao, X., & Wang, G. (2025). Network analysis of volatility spillovers between environmental, social, and governance (ESG) rating stocks: Evidence from China. Mathematics, 13(10), 1586.
- Okamoto, K., Chen, W., & Li, X.-Y. (2008). Ranking of closeness centrality for large-scale social networks. In Frontiers in Algorithmics (FAW 2008) (pp. 186–195). Springer.
- Derrien, F., Krueger, P., Landier, A., & Yao, T. (2022). ESG news, future cash flows, and firm value (Working paper).
- Hassan Nassar, O. M., Jafari, F., & Jain, C. (2025). From news to knowledge: Leveraging AI and knowledge graphs for real-time ESG insights. Sustainability, 17(24), 11128.
- Kim, M., Kang, J., Jeon, I., Lee, J., Park, J., Youm, S., Jeong, J., Woo, J., & Moon, J. (2024). Differential impacts of environmental, social, and governance news sentiment on corporate financial performance in the global market. Electronics, 13(22), 4507.










| Author(s) | Year | Topic | Dataset(s) Used | Decision-Making Relevance | Major Empirical Contributions |
| [5] | 2024 | Gone with the Chain: The Ripple Effect of ESG Performance in China’s Industrial Chain | Chinese industrial supply-chain network (1164 industries from ChinaScope, 2018–2020) combined with firm-level ESG ratings (Sino-Securities Index) | Informs firms and investors on how ESG performance shocks propagate through industrial networks and affect downstream performance | Develops a graph neural network with cross-attention to model ESG spillovers; shows ESG performance diffuses through supply-chain links and significantly influences profitability and value-chain outcomes |
| [6] | 2025 | Modeling ESG-Driven Industrial Value Chain Dynamics Using Directed Graph Neural Networks | Chinese industrial value-chain network (ChinaScope) combined with China Securities Index ESG ratings | Supports corporate strategy and policy design by identifying asymmetric upstream and downstream ESG vulnerabilities | Proposes a directed GNN distinguishing inbound and outbound flows; demonstrates that ESG shocks propagate asymmetrically and shape industrial value extension and network resilience |
| [7] | 2025 | Relational Infrastructures for Planetary Health in Brazil’s Traceable Beef Export System | Brazilian beef supply-chain network linking ranches and meatpacking facilities, augmented with transport and traceability data | Enables investors and firms to detect hidden deforestation and ESG risks embedded in indirect suppliers | Uses network analysis to uncover indirect sourcing and governance gaps; highlights how relational infrastructure conditions ESG risk and traceability in agricultural supply chains |
| [8] | 2024 | ESG Discourse in News: An AI-Powered Knowledge Graph Analysis | Dow Jones News Article dataset processed using NLP and knowledge-graph construction | Supports real-time reputational and ESG risk monitoring for firms and regulators | Constructs ESG knowledge graphs from news using transformer models; demonstrates how ESG narratives evolve and signal emerging risks |
| [9] | 2022 | Supply Chain Link Prediction on Uncertain Knowledge Graphs | Multi-tier supply-chain knowledge graph extracted from web data using NLP (VersedAI) | Enhances ESG compliance and supply-chain risk management by improving visibility across hidden tiers | Combines NLP-extracted graphs with GNN-based link prediction under uncertainty; advances multi-tier supply-chain mapping for proactive risk mitigation |
| [10] | 2024 | SHIELD: LLM-Driven Schema Induction for EV Battery Supply-Chain Disruptions | Open-source textual data on EV battery supply chains, mined using zero-shot large language models | Supports strategic sourcing and ESG risk oversight in critical-mineral and EV supply chains | Proposes an LLM-based framework for schema induction and knowledge-graph construction; enables early detection of disruption and ESG risk in multi-tier supply chains |
| Data Source | Data Type | Period Covered |
| FactSet Revere | Supplier–customer relationships; industry classifications | 2003–2024 |
| Panjiva (S&P Global) | Shipment-level import/export transactions | 2007–2024 |
| GDELT Global Knowledge Graph | ESG-related news events; sentiment metadata | 2015–2024 |
| RepRisk ESG Incident Database | Environmental, social and governance controversy records | 2007–2024 |
| Worldwide Governance Indicators (WGI) | Country-level institutional governance measures | 2003–2024 |
| Component | Description | Source / Method | Unit / Notes |
| Adjacency Matrix | Directed network built from verified supplier–buyer links. Shipment data used only to strengthen tie weights when available. | FactSet Revere (relationship direction); Panjiva (trade volumes used as optional weights) | Firm/firm edges; weight = 1 for FactSet-only ties, or shipment-based weight when available |
| Degree Centrality | Number of direct incoming and outgoing ties a firm holds | Computed from adjacency matrix | Firm-year |
| Betweenness Centrality | Extent to which a firm sits on shortest paths linking other firms | Graph-theoretic calculation | Firm-year |
| Eigenvector Centrality | Measures influence based on connection to well-positioned firms | Graph-theoretic calculation | Firm-year |
| Clustering Coefficient | Proportion of a firm’s neighbours that are connected to one another | Graph algorithm | Firm-year |
| ESG Event Count | Annual count of ESG-related news events linked to each firm | GDELT event extraction | Aggregated by firm-year |
| Sentiment Index | Average tone of ESG-related coverage | Transformer-based NLP analysis | Yearly mean sentiment score per firm |
| ESG Incident Severity | Weighted score reflecting intensity of documented ESG controversies | RepRisk incident database | Firm-year severity index |
| Shipment Anomaly Score | Annual measure of irregular trade behaviour | Autoencoder + Isolation Forest models | Mapped to firms based on shipment ownership |
| Governance Context (WGI) | Country-level institutional quality matched to each firm's headquarters | Worldwide Governance Indicators | Year matched to nearest available WGI release |
| Final Analytical Structure | Combined panel dataset integrating network, ESG and governance variables | Harmonised across all systems | Panel: firm × year (2003–2024) |
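The degree and clustering definitions in the table above can be sketched on a toy supplier–buyer edge list. This is a minimal illustration only: the four firms and their ties are hypothetical, the helper names `degree` and `clustering` are invented here, and the clustering coefficient ignores edge direction, which is an assumption rather than a documented choice of the study.

```python
from itertools import combinations


def degree(edges, node):
    """Number of incoming plus outgoing ties held by `node`."""
    out_ties = sum(1 for supplier, buyer in edges if supplier == node)
    in_ties = sum(1 for supplier, buyer in edges if buyer == node)
    return out_ties + in_ties


def clustering(edges, node):
    """Share of pairs of `node`'s neighbours that are linked to each other."""
    undirected = {frozenset(e) for e in edges}  # direction ignored (assumption)
    neighbours = {n for e in undirected if node in e for n in e if n != node}
    pairs = list(combinations(sorted(neighbours), 2))
    if not pairs:
        return 0.0  # fewer than two neighbours: coefficient set to zero
    linked = sum(1 for a, b in pairs if frozenset((a, b)) in undirected)
    return linked / len(pairs)


# Hypothetical directed supplier -> buyer ties among four firms.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

print(degree(edges, "C"))      # 3: two incoming ties, one outgoing tie
print(clustering(edges, "A"))  # 1.0: both neighbours of A trade with each other
```

For a panel at the scale described here, a graph library would normally replace these loops; the sketch is meant only to make the two definitions concrete.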
| Variable | Symbol | Type | Data Source(s) | Operational Definition |
| Environmental Risk | | Dependent | RepRisk; GDELT (environment topics) | Annual index combining the frequency and severity of environmental controversies, with supplementary signals from GDELT event themes. |
| Social Risk | | Dependent | RepRisk; GDELT (labour & social themes) | Measure of exposure to labour, community and human-rights issues, based on severity-weighted incidents and news-event counts. |
| Governance Risk | | Dependent | RepRisk; WGI | Score derived from governance-related incidents (fraud, corruption) adjusted by country-level governance indicators. |
| Shipment Volume | | Independent | Panjiva (S&P Global) | Log-transformed count of inbound and outbound shipments for firm i in year t. |
| Trade Dependency | | Independent | FactSet Revere; Panjiva | Index reflecting reliance on cross-border suppliers and customers, constructed from supplier concentration ratios and the share of foreign trade partners. |
| Negative News Sentiment | | Independent | GDELT | Average annual sentiment score of ESG-related coverage, weighted by firm-specific event volume. |
| Network Position | | Moderator | Graph metrics; GNN embeddings | Composite structural indicator capturing influence, brokerage and local connectivity based on centrality metrics and learned embeddings. |
| Lagged ESG Incident Count | | Independent (lagged) | RepRisk | Number of ESG incidents recorded for firm i in the prior year. |
| Regional Context | | Control | WGI; HDI; regulatory indices | Normalised index summarising governance quality, regulatory strength and socio-economic conditions in the firm’s home country. |
| S/N | Variable | VIF | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) |
| (1) | | 2.41 | 1.000 | | | | | | | | |
| (2) | | 2.18 | 0.623 | 1.000 | | | | | | | |
| (3) | | 2.07 | 0.482 | 0.552 | 1.000 | | | | | | |
| (4) | | 1.36 | –0.118 | –0.094 | –0.153 | 1.000 | | | | | |
| (5) | | 1.52 | 0.217 | 0.182 | 0.143 | 0.308 | 1.000 | | | | |
| (6) | | 2.33 | 0.458 | 0.507 | 0.392 | –0.062 | 0.114 | 1.000 | | | |
| (7) | | 1.71 | 0.281 | 0.309 | 0.263 | 0.218 | 0.366 | 0.187 | 1.000 | | |
| (8) | | 2.49 | 0.709 | 0.643 | 0.577 | –0.041 | 0.169 | 0.492 | 0.324 | 1.000 | |
| (9) | | 1.88 | –0.327 | –0.405 | –0.459 | 0.082 | –0.124 | –0.269 | –0.157 | –0.386 | 1.000 |
| Regressor | Coefficient | Std. Error | t-value | p-value |
| Constant | 5.982 | 1.642 | 3.644 | 0.000 |
| Network Variables | ||||
| Network-lag ESG risk | 0.351 | 0.137 | 2.563 | 0.011 |
| Degree centrality | –0.018 | 0.016 | –1.123 | 0.262 |
| Betweenness centrality | 0.094 | 0.039 | 2.410 | 0.016 |
| Eigenvector centrality | 0.157 | 0.061 | 2.573 | 0.011 |
| Firm-Level Controls | ||||
| Shipment Volume (log) | 0.129 | 0.027 | 4.778 | 0.000 |
| Trade Dependency | 0.166 | 0.228 | 0.728 | 0.468 |
| Contextual Controls | ||||
| Regional Context | –0.881 | 0.219 | –4.024 | 0.000 |
| Model Fit Statistics | ||||
| R² | 0.189 | |||
| Adjusted R² | 0.188 | |||
| F-statistic | 54.62 | | | 0.000 |
| Observations | 200,000 |
| Regressor | Environmental Coef. (SE) | Social Coef. (SE) | Governance Coef. (SE) |
| Degree centrality | 0.207* (0.067) | 0.370* (0.067) | 0.139 (0.068) |
| Betweenness centrality | 0.472* (0.062) | 0.286* (0.062) | 0.383* (0.062) |
| Eigenvector centrality | 0.373* (0.064) | 0.474* (0.064) | 0.276* (0.065) |
| Clustering coefficient | 0.212* (0.058) | 0.096* (0.058) | 0.295* (0.059) |
| Constant | 0.025 (0.057) | 0.018 (0.057) | 0.021 (0.058) |
| Model Fit | |||
| R² | 0.578 | 0.567 | 0.522 |
| Observations | 200,000 | 200,000 | 200,000 |
| Model / Specification | β (Sentimentₜ₋₁) | SE | p-value |
| Dependent Variable: ESG Incident Count | |||
| (1) FE-OLS + Driscoll–Kraay SE (continuous incidents) | −0.398*** | 0.052 | <0.001 |
| (2) Poisson FE (count outcome) | −0.211*** | 0.031 | <0.001 |
| (3) Negative Binomial FE (overdispersed counts) | −0.236*** | 0.039 | <0.001 |
| (4) PPML FE (robust to zeros and heteroskedasticity) | −0.224*** | 0.034 | <0.001 |
| (5) FE-OLS DK with Sentiment | −0.173** | 0.069 | 0.013 |
| (6a) FE-OLS DK with Sentiment | −0.311*** | 0.060 | <0.001 |
| (6b) FE-OLS DK with Sentiment | −0.089* | 0.048 | 0.067 |
| (7) FE-OLS DK, sentiment deciles (Bottom | −0.452*** | 0.083 | <0.001 |
| (8) FE-OLS DK, high-centrality subsample | −0.427*** | 0.071 | <0.001 |
| (9) FE-OLS DK, low-centrality subsample | −0.213** | 0.093 | 0.024 |
| (10) Placebo: Sentimentₜ₊₁ → Incidents | −0.021 | 0.047 | 0.658 |
| Variable | Logit: Incident Occurrence (t) | AME (Logit) | Poisson: Incident Count (t) | Cox PH: Time to First Incident |
| Sentiment Shock (t–1) | β = 2.28*** (0.03); OR = 9.81 | +0.20* | β = 0.94*** (0.01); IRR = 2.56 | β = 0.41*** (0.07); HR = 1.51 |
| log (Size) | β = −0.04** (0.02); OR = 0.96 | −0.01** | β = −0.03*** (0.01); IRR = 0.97 | β = −0.07** (0.03); HR = 0.93 |
| Network Centrality | β = 0.53*** (0.05); OR = 1.69 | +0.06*** | β = 0.17*** (0.02); IRR = 1.19 | β = 0.35*** (0.04); HR = 1.42 |
| Model diagnostics | Within R² = 0.64 | Pseudo-R² = 0.34 | PH assumption not rejected |
| Model | Feature Set | ROC–AUC | AUPRC | Balanced Accuracy | F1 Score |
| Dependent variable: Incident_any | |||||
| Logistic Regression | Controls only | 0.51 | 0.29 | 0.50 | 0.50 |
| | Controls + sentiment | 0.77 | 0.48 | 0.65 | 0.62 |
| Random Forest | Controls only | 0.51 | 0.30 | 0.50 | 0.52 |
| | Controls + sentiment | 0.79 | 0.54 | 0.68 | 0.66 |
| Gradient Boosting | Controls only | 0.49 | 0.28 | 0.50 | 0.51 |
| | Controls + sentiment | 0.66 | 0.44 | 0.64 | 0.63 |
| XGBoost | Controls only | 0.52 | 0.31 | 0.51 | 0.52 |
| | Controls + sentiment | 0.80 | 0.52 | 0.66 | 0.64 |
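The ROC–AUC comparison between feature sets can be made concrete with a small sketch. It assumes the rank-based (Mann–Whitney) definition of AUC, namely the share of positive–negative pairs in which the positive case receives the higher score, with ties counted as one half. The labels and scores below are invented for illustration and do not come from the study.

```python
def roc_auc(y_true, scores):
    """Rank-based ROC-AUC: P(score of a positive > score of a negative)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical incident labels and predicted scores from two feature sets.
y = [0, 0, 1, 1]
controls_only = [0.5, 0.5, 0.5, 0.5]      # uninformative scores
with_sentiment = [0.1, 0.4, 0.35, 0.8]    # scores after adding sentiment

auc_base = roc_auc(y, controls_only)
auc_full = roc_auc(y, with_sentiment)
print(auc_base, auc_full, auc_full - auc_base)  # 0.5 0.75 0.25
```

An AUC near 0.5 for the controls-only rows corresponds to scores that carry no ranking information, which is why the gain from adding sentiment is read as the difference between the two rows of each model.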
| Model Category | Model | ROC–AUC | Precision | Recall | F1 Score | Accuracy | Brier Score |
| Econometric models | Logistic Regression | 0.71 | 0.63 | 0.66 | 0.64 | 0.65 | 0.212 |
| | Poisson GLM | 0.69 | 0.60 | 0.62 | 0.61 | 0.63 | 0.226 |
| | Fixed-Effects Logit | 0.73 | 0.64 | 0.67 | 0.65 | 0.66 | 0.207 |
| Machine-learning models | Random Forest | 0.76 | 0.66 | 0.69 | 0.67 | 0.68 | 0.189 |
| | Gradient Boosting (GBM) | 0.77 | 0.67 | 0.70 | 0.68 | 0.69 | 0.184 |
| | XGBoost | 0.81 | 0.69 | 0.72 | 0.70 | 0.72 | 0.168 |
| Graph-based AI | GNN (GraphSAGE / GAT) | 0.87 | 0.74 | 0.76 | 0.75 | 0.76 | 0.149 |
| Model Category | Model | Brier Score | Log-Loss | MAE | RMSE |
| Machine-learning models | Random Forest | 0.189 [0.182, 0.196] | 0.544 [0.530, 0.558] | 0.296 | 0.423 |
| | Gradient Boosting | 0.184 [0.177, 0.191] | 0.538 [0.524, 0.552] | 0.287 | 0.417 |
| | XGBoost | 0.168 [0.161, 0.175] | 0.511 [0.497, 0.525] | 0.271 | 0.398 |
| | MLP Neural Network | 0.161 [0.155, 0.167] | 0.499 [0.486, 0.512] | 0.263 | 0.387 |
| Graph-based AI | Graph Neural Network (GNN) | 0.149 [0.143, 0.155] | 0.472 [0.459, 0.485] | 0.249 | 0.368 |
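The two probabilistic scores in the table above can be illustrated with a minimal sketch. It assumes the standard definitions: the Brier score as the mean squared difference between predicted probability and binary outcome, and log-loss as the mean negative log-likelihood. The four labels and probabilities are hypothetical and serve only to show the arithmetic.

```python
import math


def brier_score(y_true, p_hat):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


def log_loss(y_true, p_hat, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_hat):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


# Hypothetical incident labels and predicted incident probabilities.
y = [1, 0, 1, 0]
p = [0.9, 0.2, 0.6, 0.4]

print(round(brier_score(y, p), 4))  # 0.0925
print(round(log_loss(y, p), 4))     # 0.3375
```

As a reference point, a forecaster that always predicts 0.5 scores 0.25 on the Brier score and ln 2 ≈ 0.693 on log-loss, so lower values in the table indicate sharper and better-calibrated probabilities.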
| Model | Threshold | TP | FP | FN | TN | Precision | Recall | FPR | FNR |
| XGBoost | 0.42 | 1,710 | 1,980 | 890 | 20,420 | 0.46 | 0.66 | 0.09 | 0.34 |
| Graph Neural Network (GNN) | 0.39 | 1,930 | 1,640 | 670 | 20,760 | 0.54 | 0.74 | 0.07 | 0.26 |
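The rate columns in the confusion-matrix table follow directly from the four cell counts. The sketch below recomputes them from the reported TP, FP, FN and TN values using the standard definitions; the helper name `rates` is an illustrative choice.

```python
def rates(tp, fp, fn, tn):
    """Derive precision, recall, FPR and FNR from confusion-matrix counts."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }


# Cell counts as reported in the table.
xgb = {k: round(v, 2) for k, v in rates(1710, 1980, 890, 20420).items()}
gnn = {k: round(v, 2) for k, v in rates(1930, 1640, 670, 20760).items()}

print(xgb)  # {'precision': 0.46, 'recall': 0.66, 'fpr': 0.09, 'fnr': 0.34}
print(gnn)  # {'precision': 0.54, 'recall': 0.74, 'fpr': 0.07, 'fnr': 0.26}
```

Both rows sum to 25,000 firms with 2,600 positives, so the difference between the two models lies entirely in how those cases are classified at the chosen thresholds.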
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
