Submitted: 30 March 2025
Posted: 31 March 2025
Abstract
Keywords:
1. Introduction
2. Background and Related Work
2.1. Traditional Explainability Techniques
- Feature Importance Methods: Techniques such as SHAP (SHapley Additive exPlanations) [24] and LIME (Local Interpretable Model-agnostic Explanations) [25] attribute importance to individual input features, either by approximating the model's local decision boundary with an interpretable surrogate or by distributing contributions according to cooperative game theory [26]. A minimal sketch of this local-surrogate idea follows this list.
- Surrogate Models: Interpretable models, such as decision trees or linear regressions, are trained to mimic the behavior of complex models, yielding human-understandable approximations of their decision logic [30].
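To make the local-surrogate idea concrete, the following minimal Python sketch perturbs a single instance, queries a black-box classifier, and fits a proximity-weighted linear model whose coefficients act as local feature importances. It is an illustrative approximation of the LIME recipe rather than the reference implementation; the dataset, kernel, and Ridge surrogate are arbitrary choices made only for the example.

```python
# Minimal, illustrative sketch of a LIME-style local surrogate explanation
# (not the reference LIME implementation): perturb one instance, query the
# black box, and fit a proximity-weighted linear model whose coefficients
# act as local feature importances.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

X, y = load_breast_cancer(return_X_y=True)
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

def explain_locally(model, x, n_samples=2000, sigma=1.0, seed=0):
    """Fit a weighted linear surrogate around a single instance x."""
    rng = np.random.default_rng(seed)
    scale = X.std(axis=0)                                  # per-feature scale for perturbations
    Z = x + rng.normal(0.0, sigma, (n_samples, x.size)) * scale
    probs = model.predict_proba(Z)[:, 1]                   # black-box outputs on the samples
    dists = np.linalg.norm((Z - x) / scale, axis=1)
    weights = np.exp(-dists**2 / 2.0)                      # closer samples count more
    surrogate = Ridge(alpha=1.0).fit(Z, probs, sample_weight=weights)
    return surrogate.coef_                                  # local feature importances

importances = explain_locally(black_box, X[0])
top = np.argsort(np.abs(importances))[::-1][:5]
print("Locally most important features:", top, importances[top].round(4))
```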
2.2. Challenges in Explaining Large Language Models
- Opacity and Lack of Interpretability: Unlike simpler models, the internal representations of LLMs are not easily understandable by humans, making it difficult to extract meaningful explanations for their predictions [36].
- Scale and Computational Complexity: The sheer size of modern LLMs makes traditional feature attribution methods computationally expensive and often impractical for real-time interpretability [37].
- Contextual Dependencies: Unlike structured machine learning models, LLMs rely on sequential token dependencies, making it difficult to attribute decisions to a specific input token or phrase [38].
- Bias and Ethical Concerns: LLMs are prone to biases inherited from training data, which can manifest in outputs in unpredictable ways, highlighting the need for transparency in their decision-making processes [39].
2.3. Recent Advances in XAI for LLMs
- Attention-based Interpretability: Analyzing attention weights in transformer models is a popular way to infer how LLMs process and prioritize information [41]; a minimal sketch of inspecting these weights follows this list. However, attention does not necessarily equate to explanation, since model behavior is shaped by complex interactions beyond attention scores [42].
- Concept-based Explanations: Methods such as TCAV (Testing with Concept Activation Vectors) aim to identify and attribute high-level, human-understandable concepts in LLM representations, bridging the gap between black-box models and interpretable reasoning [43].
- Causal Analysis Techniques: Causal inference methods seek to disentangle causal relationships within LLMs by identifying which internal components contribute most significantly to specific outputs [46].
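As an illustration of attention-weight analysis, the sketch below extracts per-layer attention matrices from a pretrained transformer via the Hugging Face transformers library and averages them into a rough token-level view. The choice of bert-base-uncased and the averaging over layers and heads are assumptions made purely for demonstration, and, per the caveat above, the resulting scores should not be read as faithful explanations.

```python
# Illustrative inspection of attention weights in a pretrained transformer.
# The model choice and the averaging over layers/heads are assumptions made
# only for demonstration; attention scores are not faithful explanations.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tok("The movie was surprisingly good", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs)

# out.attentions: one tensor per layer, each shaped (batch, heads, seq, seq).
attn = torch.stack(out.attentions).mean(dim=(0, 2))[0]  # average over layers and heads
tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0].tolist())

for i, token in enumerate(tokens):
    j = int(attn[i].argmax())
    print(f"{token:>14s} attends most to {tokens[j]!r} ({float(attn[i, j]):.2f})")
```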
2.4. Ethical and Societal Implications of Explainability in LLMs
- Fairness and Bias Mitigation: Explainability techniques can help identify biases in LLMs, enabling interventions to reduce discriminatory outputs [25].
- Accountability and Transparency: Regulatory frameworks increasingly demand that AI decisions be explainable, particularly in high-stakes domains such as healthcare and law [52].
2.5. Summary and Research Gaps
3. Methodologies for Explainable AI in LLMs
3.1. Feature Attribution Methods
3.2. Attention-based Analysis
3.3. Concept-based Explanations
3.4. Counterfactual and Contrastive Explanations
3.5. Causal Analysis and Model Distillation
3.6. Human-centered and Interactive Explainability
- Natural Language Explanations: Generating explanations in human-readable text to facilitate understanding and transparency [77].
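A hedged sketch of this idea: an instruction-tuned model is prompted to return a prediction together with a short plain-English rationale. The specific model (google/flan-t5-small) and prompt wording are illustrative assumptions; any instruction-tuned LLM or hosted API could be substituted, and larger models generally produce more useful rationales.

```python
# Hedged sketch of generating a natural-language explanation alongside a
# prediction; the model and prompt are illustrative assumptions, and any
# instruction-tuned LLM or hosted API could be substituted.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

def predict_with_explanation(review: str) -> str:
    prompt = (
        "Classify the sentiment of this review as positive or negative, "
        "then briefly explain which words support your decision.\n"
        f"Review: {review}"
    )
    return generator(prompt, max_new_tokens=80)[0]["generated_text"]

print(predict_with_explanation("The plot dragged, but the acting was superb."))
```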
4. Challenges and Open Research Questions
4.1. Scalability and Computational Constraints
4.2. Faithfulness and Reliability of Explanations
4.3. Interpretability vs. Performance Trade-offs
4.4. User-Centric and Domain-Specific Explanations
- What role does human feedback play in refining and validating LLM explanations [103]?
4.5. Mitigating Bias and Ethical Concerns
4.6. Future Directions in Explainable AI for LLMs
5. Conclusion and Future Research Directions
5.1. Summary of Key Findings
- Traditional XAI methods such as SHAP, LIME, and attention-based mechanisms provide insights into model behavior but are often insufficient for the complexity of LLMs [115].
- Emerging techniques, including causal analysis, concept-based explanations, and counterfactual reasoning (illustrated in the sketch after this list), offer promising directions for improving interpretability [116].
- Trade-offs between interpretability and performance remain a significant hurdle, necessitating new methods that balance fidelity and usability [119].
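To illustrate the counterfactual reasoning mentioned above, the sketch below performs a naive greedy search for a small feature change that flips a classifier's prediction. It is only a toy instance of the idea under arbitrary assumptions (dataset, step size, classifier); practical counterfactual methods add sparsity, plausibility, and actionability constraints.

```python
# Toy counterfactual search (illustrative only): greedily nudge one feature at
# a time until the classifier's prediction flips. Practical counterfactual
# methods add sparsity, plausibility, and actionability constraints.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)

def counterfactual(x, target, step=0.05, max_iter=100):
    """Greedy search for a nearby point classified as `target`."""
    x_cf, scale = x.copy(), X.std(axis=0)
    for _ in range(max_iter):
        if clf.predict([x_cf])[0] == target:
            break
        p0 = clf.predict_proba([x_cf])[0, target]
        best_gain, best_trial = 0.0, None
        for j in range(x_cf.size):                 # try moving each feature up or down
            for direction in (+1.0, -1.0):
                trial = x_cf.copy()
                trial[j] += direction * step * scale[j]
                gain = clf.predict_proba([trial])[0, target] - p0
                if gain > best_gain:
                    best_gain, best_trial = gain, trial
        if best_trial is None:                     # no single move improves the target probability
            break
        x_cf = best_trial
    return x_cf

x = X[0]
x_cf = counterfactual(x, target=int(1 - clf.predict([x])[0]))
changed = np.flatnonzero(~np.isclose(x, x_cf))
print("Features changed:", changed, "| new prediction:", clf.predict([x_cf])[0])
```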
5.2. Future Research Directions
- Human-Centric Explanations: Designing adaptive and interactive XAI systems that provide explanations tailored to specific user needs and domains [122].
- Causal and Counterfactual Approaches: Improving causal inference techniques to provide more meaningful and actionable explanations [123].
- Regulatory and Ethical Frameworks: Establishing guidelines and best practices to ensure that XAI aligns with legal and ethical standards [124].
- Integration with Human Feedback: Enhancing XAI techniques through active learning and human-in-the-loop approaches to refine explanations dynamically [125].
5.3. Final Thoughts
References
- Wang, X.; Kim, H.; Rahman, S.; Mitra, K.; Miao, Z. Human-LLM Collaborative Annotation Through Effective Verification of LLM Labels. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), New York, NY, USA, 2024; pp. 1–21. [CrossRef]
- Hassija, V.; Chamola, V.; Mahapatra, A.; Singal, A.; Goel, D.; Huang, K.; Scardapane, S.; Spinelli, I.; Mahmud, M.; Hussain, A. Interpreting black-box models: a review on explainable artificial intelligence. Cognitive Computation 2024, 16, 45–74. [CrossRef]
- Arras, L.; Horn, F.; Montavon, G.; Müller, K.R.; Samek, W. "What is relevant in a text document?": An interpretable machine learning approach. PloS one 2017, 12, e0181142. [CrossRef]
- DeYoung, J.; Jain, S.; Rajani, N.F.; Lehman, E.; Xiong, C.; Socher, R.; Wallace, B.C. ERASER: A benchmark to evaluate rationalized NLP models. arXiv preprint arXiv:1911.03429, 2019.
- Holliday, D.; Wilson, S.; Stumpf, S. User trust in intelligent systems: A journey over time. In Proceedings of the 21st International Conference on Intelligent User Interfaces, 2016, pp. 164–168.
- Maliha, G.; Gerke, S.; Cohen, I.G.; Parikh, R.B. Artificial Intelligence and Liability in Medicine. The Milbank Quarterly 2021, 99, 629–647. [CrossRef] [PubMed]
- Doshi-Velez, F.; Kim, B. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608, 2017.
- Zhang, G.; Kashima, H. Learning state importance for preference-based reinforcement learning. Machine Learning 2023, pp. 1–17.
- Nwakanma, C.I.; Ahakonye, L.A.C.; Njoku, J.N.; Odirichukwu, J.C.; Okolie, S.A.; Uzondu, C.; Ndubuisi Nweke, C.C.; Kim, D.S. Explainable Artificial Intelligence (XAI) for intrusion detection and mitigation in intelligent connected vehicles: A review. Applied Sciences 2023, 13, 1252. [CrossRef]
- Rahman, M.; Polunsky, S.; Jones, S. Transportation policies for connected and automated mobility in smart cities. In Smart Cities Policies and Financing; Elsevier, 2022; pp. 97–116.
- Aubin Le Quéré, M.; Schroeder, H.; Randazzo, C.; Gao, J.; Epstein, Z.; Perrault, S.T.; Mimno, D.; Barkhuus, L.; Li, H. LLMs as Research Tools: Applications and Evaluations in HCI Data Work. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Honolulu HI USA, 2024; pp. 1–7. [CrossRef]
- Chen, V.; Liao, Q.V.; Wortman Vaughan, J.; Bansal, G. Understanding the role of human intuition on reliance in human-AI decision-making with explanations. Proceedings of the ACM on Human-Computer Interaction 2023, 7, 1–32. [CrossRef]
- Linardatos, P.; Papastefanopoulos, V.; Kotsiantis, S. Explainable AI: A review of machine learning interpretability methods. Entropy 2020, 23, 18. [CrossRef]
- Harren, T.; Matter, H.; Hessler, G.; Rarey, M.; Grebner, C. Interpretation of structure–activity relationships in real-world drug design data sets using explainable artificial intelligence. Journal of Chemical Information and Modeling 2022, 62, 447–462. [CrossRef]
- Gunning, D.; Aha, D. DARPA’s Explainable Artificial Intelligence (XAI) program. AI Magazine 2019, 40, 44–58. [CrossRef]
- Nourani, M.; Kabir, S.; Mohseni, S.; Ragan, E.D. The effects of meaningful and meaningless explanations on trust and perceived system accuracy in intelligent systems. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 2019, Vol. 7, pp. 97–105.
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Xia, F.; Chi, E.; Le, Q.V.; Zhou, D.; et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 2022, 35, 24824–24837.
- Adhikari, A.; Tax, D.M.J.; Satta, R.; Faeth, M. LEAFAGE: Example-based and Feature importance-based Explanations for Black-box ML models. In Proceedings of the 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), New Orleans, LA, USA, 2019; pp. 1–7. [CrossRef]
- Guidotti, R.; Monreale, A.; Ruggieri, S.; Turini, F.; Giannotti, F.; Pedreschi, D. A survey of methods for explaining black box models. ACM Computing Surveys (CSUR) 2018, 51, 1–42. [CrossRef]
- Miller, T. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 2019, 267, 1–38. [CrossRef]
- Jie, Y.W.; Satapathy, R.; Mong, G.S.; Cambria, E.; et al. How Interpretable are Reasoning Explanations from Prompting Large Language Models? arXiv preprint arXiv:2402.11863, 2024.
- Burton, S.; Habli, I.; Lawton, T.; McDermid, J.; Morgan, P.; Porter, Z. Mind the gaps: Assuring the safety of autonomous systems from an engineering, ethical, and legal perspective. Artificial Intelligence 2020, 279, 103201. [CrossRef]
- Hamamoto, R. Application of artificial intelligence for medical research, 2021.
- Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. Advances in Neural Information Processing Systems 2017, 30.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. “Why should I trust you?” Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
- Bian, Z.; Xia, S.; Xia, C.; Shao, M. Weakly supervised vitiligo segmentation in skin image through saliency propagation. In Proceedings of the 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2019, pp. 931–934.
- Sundararajan, M.; Taly, A.; Yan, Q. Axiomatic attribution for deep networks. In Proceedings of the International Conference on Machine Learning. PMLR, 2017, pp. 3319–3328.
- Chamola, V.; Hassija, V.; Sulthana, A.R.; Ghosh, D.; Dhingra, D.; Sikdar, B. A review of trustworthy and Explainable Artificial Intelligence (XAI). IEEE Access 2023. [CrossRef]
- Marcinkevičs, R.; Vogt, J.E. Interpretability and explainability: A machine learning zoo mini-tour. arXiv preprint arXiv:2012.01805 2020.
- Atakishiyev, S.; Salameh, M.; Yao, H.; Goebel, R. Towards safe, explainable, and regulated autonomous driving. arXiv preprint arXiv:2111.10518 2021.
- Lopes, P.; Silva, E.; Braga, C.; Oliveira, T.; Rosado, L. XAI Systems Evaluation: A Review of Human and Computer-Centred Methods. Applied Sciences 2022, 12, 9423. [CrossRef]
- Qin, Y.; Song, D.; Chen, H.; Cheng, W.; Jiang, G.; Cottrell, G. A dual-stage attention-based recurrent neural network for time series prediction. arXiv preprint arXiv:1704.02971 2017.
- Zafar, M.R.; Khan, N. Deterministic local interpretable model-agnostic explanations for stable explainability. Machine Learning and Knowledge Extraction 2021, 3, 525–541. [CrossRef]
- Saraswat, D.; Bhattacharya, P.; Verma, A.; Prasad, V.K.; Tanwar, S.; Sharma, G.; Bokoro, P.N.; Sharma, R. Explainable AI for healthcare 5.0: opportunities and challenges. IEEE Access 2022.
- Li, L.; Xu, M.; Liu, H.; Li, Y.; Wang, X.; Jiang, L.; Wang, Z.; Fan, X.; Wang, N. A large-scale database and a CNN model for attention-based glaucoma detection. IEEE transactions on Medical Imaging 2019, 39, 413–424. [CrossRef]
- McInnes, L.; Healy, J.; Melville, J. UMAP: Uniform manifold approximation and projection for dimension reduction. arXiv preprint arXiv:1802.03426 2018.
- Mankodiya, H.; Jadav, D.; Gupta, R.; Tanwar, S.; Hong, W.C.; Sharma, R. Od-XAI: Explainable AI-based semantic object detection for autonomous vehicles. Applied Sciences 2022, 12, 5310. [CrossRef]
- Bano, M.; Zowghi, D.; Whittle, J. Exploring Qualitative Research Using LLMs 2023.
- van der Waa, J.; Nieuwburg, E.; Cremers, A.; Neerincx, M. Evaluating XAI: A comparison of rule-based and example-based explanations. Artificial Intelligence 2021, 291, 103404. [CrossRef]
- Wang, B.; Zhou, J.; Li, Y.; Chen, F. Impact of Fidelity and Robustness of Machine Learning Explanations on User Trust. In Proceedings of the Australasian Joint Conference on Artificial Intelligence. Springer, 2023, pp. 209–220.
- European Parliament and Council of the European Union. Regulation (EU) 2016/679 (General Data Protection Regulation). Official Journal of the European Union 2016.
- Chern, S.; Chern, E.; Neubig, G.; Liu, P. Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate, 2024. arXiv:2401.16788 [cs].
- Krause, J.; Perer, A.; Ng, K. Interacting with predictions: Visual inspection of black-box machine learning models. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016, pp. 5686–5697.
- Wachter, S.; Mittelstadt, B.; Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. JL & Tech. 2017, 31, 841.
- Anton, N.; Doroftei, B.; Curteanu, S.; Catãlin, L.; Ilie, O.D.; Târcoveanu, F.; Bogdănici, C.M. Comprehensive review on the use of artificial intelligence in ophthalmology and future research directions. Diagnostics 2022, 13, 100. [CrossRef]
- Ali, S.; Abuhmed, T.; El-Sappagh, S.; Muhammad, K.; Alonso-Moral, J.M.; Confalonieri, R.; Guidotti, R.; Del Ser, J.; Díaz-Rodríguez, N.; Herrera, F. Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence. Information fusion 2023, 99, 101805. [CrossRef]
- Job, S.; Tao, X.; Li, L.; Xie, H.; Cai, T.; Yong, J.; Li, Q. Optimal treatment strategies for critical patients with deep reinforcement learning. ACM Transactions on Intelligent Systems and Technology 2024, 15, 1–22. [CrossRef]
- Verma, S.; Boonsanong, V.; Hoang, M.; Hines, K.E.; Dickerson, J.P.; Shah, C. Counterfactual explanations and algorithmic recourses for machine learning: A review. arXiv preprint arXiv:2010.10596 2020.
- Hanawa, K.; Yokoi, S.; Hara, S.; Inui, K. Evaluation of Similarity-based Explanations, 2021. arXiv:2006.04528 [cs, stat].
- Kha, Q.H.; Le, V.H.; Hung, T.N.K.; Nguyen, N.T.K.; Le, N.Q.K. Development and Validation of an Explainable Machine Learning-Based Prediction Model for Drug–Food Interactions from Chemical Structures. Sensors 2023, 23, 3962. [CrossRef]
- Kojima, T.; Gu, S.S.; Reid, M.; Matsuo, Y.; Iwasawa, Y. Large Language Models are Zero-Shot Reasoners, 2023. arXiv:2205.11916 [cs].
- Farahat, A.; Reichert, C.; Sweeney-Reed, C.M.; Hinrichs, H. Convolutional neural networks for decoding of covert attention focus and saliency maps for EEG feature visualization. Journal of Neural Engineering 2019, 16, 066010. [CrossRef]
- Arrieta, A.B.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.; Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion 2020, 58, 82–115. [CrossRef]
- Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13. Springer, 2014, pp. 818–833.
- Yang, G.; Ye, Q.; Xia, J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. Information Fusion 2022, 77, 29–52. [CrossRef]
- Castelnovo, A.; Depalmas, R.; Mercorio, F.; Mombelli, N.; Potertì, D.; Serino, A.; Seveso, A.; Sorrentino, S.; Viola, L. Augmenting XAI with LLMs: A Case Study in Banking Marketing Recommendation. In Proceedings of the Explainable Artificial Intelligence; Longo, L.; Lapuschkin, S.; Seifert, C., Eds., Cham, 2024; pp. 211–229. [CrossRef]
- Oviedo, F.; Ferres, J.L.; Buonassisi, T.; Butler, K.T. Interpretable and explainable machine learning for materials science and chemistry. Accounts of Materials Research 2022, 3, 597–607. [CrossRef]
- Lertvittayakumjorn, P.; Toni, F. Human-grounded evaluations of explanation methods for text classification. arXiv preprint arXiv:1908.11355 2019.
- Kolla, M.; Salunkhe, S.; Chandrasekharan, E.; Saha, K. LLM-Mod: Can Large Language Models Assist Content Moderation? In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, Honolulu HI USA, 2024; pp. 1–8. [CrossRef]
- Ribeiro, M.T.; Singh, S.; Guestrin, C. Anchors: High-precision model-agnostic explanations. In Proceedings of the AAAI Conference on Artificial Intelligence, 2018, Vol. 32.
- Albahri, A.; Duhaim, A.M.; Fadhel, M.A.; Alnoor, A.; Baqer, N.S.; Alzubaidi, L.; Albahri, O.; Alamoodi, A.; Bai, J.; Salhi, A.; et al. A systematic review of trustworthy and Explainable Artificial Intelligence in healthcare: Assessment of quality, bias risk, and data fusion. Information Fusion 2023. [CrossRef]
- Rudin, C.; Radin, J. Why are we using black box models in AI when we don’t need to? A lesson from an explainable AI competition. Harvard Data Science Review 2019, 1, 1–9. [CrossRef]
- El Naqa, I.; Murphy, M.J. What is machine learning?; Springer, 2015.
- Huang, Z.; Yao, X.; Liu, Y.; Dumitru, C.O.; Datcu, M.; Han, J. Physically explainable CNN for SAR image classification. ISPRS Journal of Photogrammetry and Remote Sensing 2022, 190, 25–37. [CrossRef]
- Sundararajan, M.; Taly, A.; Yan, Q. Axiomatic Attribution for Deep Networks. In Proceedings of the 34th International Conference on Machine Learning; Precup, D.; Teh, Y.W., Eds.; PMLR, 2017; Vol. 70, Proceedings of Machine Learning Research, pp. 3319–3328.
- Fisher, R.A. The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics 1936, 7, 179–188. [CrossRef]
- Munir, M. Thesis approved by the Department of Computer Science of the TU Kaiserslautern for the award of the Doctoral Degree doctor of engineering. PhD thesis, Kyushu University, Japan, 2021.
- Hu, T.; Zhou, X.H. Unveiling LLM Evaluation Focused on Metrics: Challenges and Solutions. arXiv preprint arXiv:2404.09135 2024.
- Puiutta, E.; Veith, E.M. Explainable reinforcement learning: A survey. In Proceedings of the International Cross-domain Conference for Machine Learning and Knowledge Extraction. Springer, 2020, pp. 77–95.
- Ma, S.; Chen, Q.; Wang, X.; Zheng, C.; Peng, Z.; Yin, M.; Ma, X. Towards Human-AI Deliberation: Design and Evaluation of LLM-Empowered Deliberative AI for AI-Assisted Decision-Making, 2024. arXiv:2403.16812 [cs].
- Weller, A. Transparency: motivations and challenges. In Explainable AI: interpreting, explaining and visualizing deep learning; Springer, 2019; pp. 23–40.
- Yilma, B.A.; Kim, C.M.; Cupchik, G.C.; Leiva, L.A. Artful Path to Healing: Using Machine Learning for Visual Art Recommendation to Prevent and Reduce Post-Intensive Care Syndrome (PICS). In Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–19.
- Atakishiyev, S.; Salameh, M.; Yao, H.; Goebel, R. Explainable artificial intelligence for autonomous driving: A comprehensive overview and field guide for future research directions. arXiv preprint arXiv:2112.11561 2021.
- Ismail, A.A.; Gunady, M.; Corrada Bravo, H.; Feizi, S. Benchmarking deep learning interpretability in time series predictions. Advances in Neural Information Processing Systems 2020, 33, 6441–6452.
- Sadeghi Tabas, S. Explainable Physics-informed Deep Learning for Rainfall-runoff Modeling and Uncertainty Assessment across the Continental United States 2023.
- Plumb, G.; Wang, S.; Chen, Y.; Rudin, C. Interpretable Decision Sets: A Joint Framework for Description and Prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM, 2018, pp. 1677–1686.
- Crocker, J.; Kumar, K.; Cox, B. Using explainability to design physics-aware CNNs for solving subsurface inverse problems. Computers and Geotechnics 2023, 159, 105452. [CrossRef]
- Zhou, J.; Gandomi, A.H.; Chen, F.; Holzinger, A. Evaluating the quality of machine learning explanations: A survey on methods and metrics. Electronics 2021, 10, 593. [CrossRef]
- Weber, L.; Lapuschkin, S.; Binder, A.; Samek, W. Beyond explaining: Opportunities and challenges of XAI-based model improvement. Information Fusion 2023, 92, 154–176. [CrossRef]
- Hedström, A.; Weber, L.; Krakowczyk, D.; Bareeva, D.; Motzkus, F.; Samek, W.; Lapuschkin, S.; Höhne, M.M.C. Quantus: An explainable ai toolkit for responsible evaluation of neural network explanations and beyond. Journal of Machine Learning Research 2023, 24, 1–11.
- Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 2019, 1, 206–215. [CrossRef]
- Schlegel, U.; Keim, D.A. Time series model attribution visualizations as explanations. In Proceedings of the 2021 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX). IEEE, 2021, pp. 27–31.
- Zniyed, Y.; Nguyen, T.P.; et al. Enhanced network compression through tensor decompositions and pruning. IEEE Transactions on Neural Networks and Learning Systems 2024.
- Alharin, A.; Doan, T.N.; Sartipi, M. Reinforcement learning interpretation methods: A survey. IEEE Access 2020, 8, 171058–171077. [CrossRef]
- Zhu, Y.; Zhou, Y.; Ye, Q.; Qiu, Q.; Jiao, J. Soft proposal networks for weakly supervised object localization. In Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 1841–1850.
- Cooper, J.; Arandjelović, O.; Harrison, D.J. Believe the HiPe: Hierarchical perturbation for fast, robust, and model-agnostic saliency mapping. Pattern Recognition 2022, 129, 108743. [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Communications of the ACM 2017, 60, 84–90. [CrossRef]
- Tjoa, E.; Guan, C. A survey on Explainable Artificial Intelligence (XAI): Toward medical XAI. IEEE Transactions on Neural Networks and Learning Systems 2020, 32, 4793–4813. [CrossRef] [PubMed]
- Madhav, A.S.; Tyagi, A.K. Explainable Artificial Intelligence (XAI): connecting artificial decision-making and human trust in autonomous vehicles. In Proceedings of the Third International Conference on Computing, Communications, and Cyber-Security: IC4S 2021. Springer, 2022, pp. 123–136.
- Hoffman, R.R.; Mueller, S.T.; Klein, G.; Litman, J. Metrics for Explainable AI: Challenges and Prospects, 2019. arXiv:1812.04608 [cs].
- Springenberg, J.T.; Dosovitskiy, A.; Brox, T.; Riedmiller, M. Towards better analysis of deep convolutional neural networks. International Conference on Learning Representations (ICLR) 2015.
- Chowdhary, K.; Chowdhary, K. Natural language processing. Fundamentals of Artificial Intelligence 2020, pp. 603–649.
- Wang, Z.; Yan, W.; Oates, T. Time series classification from scratch with deep neural networks: A strong baseline. In Proceedings of the 2017 International joint Conference on Neural Networks (IJCNN). IEEE, 2017, pp. 1578–1585.
- Dai, J.; Upadhyay, S.; Aivodji, U.; Bach, S.H.; Lakkaraju, H. Fairness via explanation quality: Evaluating disparities in the quality of post hoc explanations. In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, 2022, pp. 203–214.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 10012–10022.
- Zniyed, Y.; Nguyen, T.P.; et al. Efficient tensor decomposition-based filter pruning. Neural Networks 2024, 178, 106393.
- Huber, T.; Weitz, K.; André, E.; Amir, O. Local and global explanations of agent behavior: Integrating strategy summaries with saliency maps. Artificial Intelligence 2021, 301, 103571. [CrossRef]
- Anguita-Ruiz, A.; Segura-Delgado, A.; Alcalá, R.; Aguilera, C.M.; Alcalá-Fdez, J. eXplainable Artificial Intelligence (XAI) for the identification of biologically relevant gene expression patterns in longitudinal human studies, insights from obesity research. PLoS Computational Biology 2020, 16, e1007792. [CrossRef]
- Ye, Y.; Zhang, X.; Sun, J. Automated vehicle’s behavior decision making using deep reinforcement learning and high-fidelity simulation environment. Transportation Research Part C: Emerging Technologies 2019, 107, 155–170. [CrossRef]
- Chakraborty, S.; Tomsett, R.; Raghavendra, R.; Harborne, D.; Alzantot, M.; Cerutti, F.; Srivastava, M.; Preece, A.; Julier, S.; Rao, R.M.; et al. Interpretability of deep learning models: A survey of results. In Proceedings of the 2017 IEEE smartworld, ubiquitous intelligence & computing, advanced & trusted computed, scalable computing & communications, cloud & big data computing, Internet of people and smart city innovation (smartworld/SCALCOM/UIC/ATC/CBDcom/IOP/SCI). IEEE, 2017, pp. 1–6.
- Heuillet, A.; Couthouis, F.; Díaz-Rodríguez, N. Collective explainable AI: Explaining cooperative strategies and agent contribution in multiagent reinforcement learning with shapley values. IEEE Computational Intelligence Magazine 2022, 17, 59–71. [CrossRef]
- Ward, A.; Sarraju, A.; Chung, S.; Li, J.; Harrington, R.; Heidenreich, P.; Palaniappan, L.; Scheinker, D.; Rodriguez, F. Machine learning and atherosclerotic cardiovascular disease risk prediction in a multi-ethnic population. NPJ Digital Medicine 2020, 3, 125. [CrossRef]
- Li, J.; King, S.; Jennions, I. Intelligent Fault Diagnosis of an Aircraft Fuel System Using Machine Learning—A Literature Review. Machines 2023, 11, 481. [CrossRef]
- Bavaresco, A.; Bernardi, R.; Bertolazzi, L.; Elliott, D.; Fernández, R.; Gatt, A.; Ghaleb, E.; Giulianelli, M.; Hanna, M.; Koller, A.; et al. LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks. arXiv preprint arXiv:2406.18403 2024.
- Jain, S.; Wallace, B.C. Attention is not explanation. arXiv preprint arXiv:1902.10186 2019.
- Ribeiro, M.T.; Singh, S.; Guestrin, C. Model-agnostic interpretability of machine learning. arXiv preprint arXiv:1606.05386 2016.
- Nauta, M.; Trienes, J.; Pathak, S.; Nguyen, E.; Peters, M.; Schmitt, Y.; Schlötterer, J.; Van Keulen, M.; Seifert, C. From Anecdotal Evidence to Quantitative Evaluation Methods: A Systematic Review on Evaluating Explainable AI. ACM Computing Surveys 2023, 55, 1–42. [CrossRef]
- Fuhrman, J.D.; Gorre, N.; Hu, Q.; Li, H.; El Naqa, I.; Giger, M.L. A review of explainable and interpretable AI with applications in COVID-19 imaging. Medical Physics 2022, 49, 1–14. [CrossRef] [PubMed]
- Chaddad, A.; Peng, J.; Xu, J.; Bouridane, A. Survey of explainable AI techniques in healthcare. Sensors 2023, 23, 634. [CrossRef]
- Askr, H.; Elgeldawi, E.; Aboul Ella, H.; Elshaier, Y.A.; Gomaa, M.M.; Hassanien, A.E. Deep learning in drug discovery: an integrative review and future challenges. Artificial Intelligence Review 2023, 56, 5975–6037. [CrossRef]
- El-Sappagh, S.; Alonso, J.M.; Islam, S.R.; Sultan, A.M.; Kwak, K.S. A multilayer multimodal detection and prediction model based on explainable artificial intelligence for Alzheimer’s disease. Scientific Reports 2021, 11, 2660. [CrossRef]
- Loh, H.W.; Ooi, C.P.; Seoni, S.; Barua, P.D.; Molinari, F.; Acharya, U.R. Application of Explainable Artificial Intelligence for healthcare: A systematic review of the last decade (2011–2022). Computer Methods and Programs in Biomedicine 2022, p. 107161.
- Zhou, X.; Tang, J.; Lyu, H.; Liu, X.; Zhang, Z.; Qin, L.; Au, F.; Sarkar, A.; Bai, Z. Creating an authoring tool for K-12 teachers to design ML-supported scientific inquiry learning. In Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–7.
- Kim, T.S.; Lee, Y.; Shin, J.; Kim, Y.H.; Kim, J. EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria. In Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024, pp. 1–21. arXiv:2309.13633 [cs]. [CrossRef]
- Minh, D.; Wang, H.X.; Li, Y.F.; Nguyen, T.N. Explainable Artificial Intelligence: a comprehensive review. Artificial Intelligence Review 2022, pp. 1–66.
- Feng, J.; Lansford, J.L.; Katsoulakis, M.A.; Vlachos, D.G. Explainable and trustworthy artificial intelligence for correctable modeling in chemical sciences. Science advances 2020, 6, eabc3204. [CrossRef]
- Shumway, R.H.; Stoffer, D.S.; Stoffer, D.S. Time series analysis and its applications; Vol. 3, Springer, 2000.
- Lipton, Z.C.; Kale, D.C.; Wetzel, R.; et al. Modeling missing data in clinical time series with rnns. Machine Learning for Healthcare 2016, 56, 253–270.
- Van Der Maaten, L. Accelerating t-SNE using tree-based algorithms. The Journal of Machine Learning Research 2014, 15, 3221–3245.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
- Mankodiya, H.; Obaidat, M.S.; Gupta, R.; Tanwar, S. XAI-AV: Explainable artificial intelligence for trust management in autonomous vehicles. In Proceedings of the 2021 International Conference on Communications, Computing, Cybersecurity, and Informatics (CCCI). IEEE, 2021, pp. 1–5.
- Fan, F.L.; Xiong, J.; Li, M.; Wang, G. On interpretability of artificial neural networks: A survey. IEEE Transactions on Radiation and Plasma Medical Sciences 2021, 5, 741–760. [CrossRef] [PubMed]
- Langer, M.; Oster, D.; Speith, T.; Hermanns, H.; Kästner, L.; Schmidt, E.; Sesing, A.; Baum, K. What do we want from Explainable Artificial Intelligence (XAI)?–A stakeholder perspective on XAI and a conceptual model guiding interdisciplinary XAI research. Artificial Intelligence 2021, 296, 103473. [CrossRef]
- Dam, H.K.; Tran, T.; Ghose, A. Explainable software analytics. In Proceedings of the 40th International Conference on Software Engineering: New Ideas and Emerging Results, 2018, pp. 53–56.
- Awotunde, J.B.; Adeniyi, E.A.; Ajamu, G.J.; Balogun, G.B.; Taofeek-Ibrahim, F.A. Explainable Artificial Intelligence in Genomic Sequence for Healthcare Systems Prediction. In Connected e-Health: Integrated IoT and Cloud Computing; Springer, 2022; pp. 417–437.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
