Submitted: 08 March 2025
Posted: 10 March 2025
Abstract
Keywords:
1. Introduction
2. Requirement Specification
2.1. Interpretation Tools
2.2. Purposes of Interpretation Tools
i. Enabling the deployer to reject or correct training or test data on the basis of insufficient data quality. Data quality issues can include unwanted confounding, data imbalance, bias, and the presence of noise, artifacts, and outliers.
ii. Enabling the deployer to reject or correct an AI model on the basis of insufficient training data quality or inappropriate model behavior. Inappropriate model behavior can include unwanted reliance on confounding information in the data, unacceptable levels of uncertainty, bias, and unfair decision making.
iii. Enabling the deployer to reject, scrutinize (e.g., by cross-checking with the output of a second model or the opinion of a human expert), or correct outputs of a model on a given input or group of inputs on the basis of insufficient quality of test inputs, test inputs lying outside the training distribution, unacceptable levels of uncertainty, or other reasons (see the first sketch after this list).
iv. Selecting certain training or test data inputs, or input dimensions, for further inspection, e.g., to confirm the presence of noise or artifacts in inputs, or to assess the predictive value of individual input dimensions.
v. Recommending certain dimensions of a test input for external intervention, for example with the goal of simulating a model’s output (e.g., a credit risk score) or predicting a real-world quantity approximated by the model (e.g., a health outcome) in a counterfactual setting (see the second sketch after this list).
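As a minimal illustration of purpose (iii), the following Python sketch flags test inputs that lie far from the training data, so that the corresponding model outputs can be rejected or scrutinized. The k-nearest-neighbor distance score and the 95th-percentile rejection threshold are illustrative assumptions, not prescribed choices.

```python
# Minimal sketch of purpose (iii): flag test inputs that are far from the
# training distribution so the deployer can reject or scrutinize the model's
# outputs for them. The kNN distance score and the 95th-percentile threshold
# are illustrative assumptions.
import numpy as np

def knn_distance(reference: np.ndarray, queries: np.ndarray, k: int = 5) -> np.ndarray:
    """Mean Euclidean distance from each query to its k nearest reference points."""
    d = np.linalg.norm(queries[:, None, :] - reference[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, :k].mean(axis=1)

rng = np.random.default_rng(0)
train_x = rng.normal(size=(500, 8))   # stand-in training inputs
test_x = rng.normal(size=(20, 8))     # stand-in test inputs
test_x[:3] += 6.0                     # three synthetic out-of-distribution inputs

# Calibrate the rejection threshold on a held-out part of the training data.
fit_x, cal_x = train_x[:400], train_x[400:]
threshold = np.quantile(knn_distance(fit_x, cal_x), 0.95)

flags = knn_distance(fit_x, test_x) > threshold
print(f"flagged {flags.sum()} of {len(test_x)} test inputs for scrutiny")
```

In practice, the distance score would be replaced by whatever out-of-distribution or uncertainty measure the interpretation tool actually provides; the point is only that the tool’s output supports a concrete accept/reject decision.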
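Purpose (v) can likewise be sketched in a few lines. Simulating the model’s output under an intervention only requires evaluating the model on the modified input; predicting the real-world quantity in the counterfactual setting would additionally require causal assumptions that a purely predictive model does not supply. The linear credit-score model, its coefficients, and the feature names below are hypothetical.

```python
# Minimal sketch of purpose (v): simulate a model's output under an external
# intervention on one input dimension (a counterfactual query). The linear
# "credit score" model, its coefficients, and the feature names are hypothetical.
import numpy as np

FEATURES = ["income", "debt_ratio", "n_late_payments"]
weights = np.array([0.4, -1.2, -0.8])   # stand-in fitted coefficients
bias = 0.5

def credit_score(x: np.ndarray) -> float:
    """Stand-in risk model; higher scores mean lower estimated risk."""
    return float(weights @ x + bias)

applicant = np.array([1.1, 0.6, 3.0])   # one (standardized) test input

# Intervene on the recommended dimension, holding all other dimensions fixed.
counterfactual = applicant.copy()
counterfactual[FEATURES.index("n_late_payments")] = 0.0

print(f"factual score:        {credit_score(applicant):+.2f}")
print(f"counterfactual score: {credit_score(counterfactual):+.2f}")
```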
2.3. Requirements on Interpretation Tools
i. The intended purpose of the tool.
ii. The information provided by the tool, defined as the concrete interpretation of the tool’s output. The output shall correspond to well-defined, unambiguous properties of the AI system in general or of its components. Example: a tool may provide 95% confidence intervals for outputs of an AI system.
iii. A logically sound line of argument stating how the provided information enables the tool to fulfill its intended purpose when used by the deployer according to the provided instructions.
iv. The technical constraints, including assumptions on the components of the AI system (e.g., model class, training data, test input), affecting the accuracy and precision of the tool with respect to providing correct information about the AI system or its components and with respect to serving its intended purpose.
v. The expected accuracy and precision of the tool with respect to providing correct information, and the expected accuracy and precision of the tool with respect to serving its intended purpose. Reported accuracies and precisions shall be based on either of the following, or both:
- Theoretical guarantees taking into account the technical constraints and assumptions of the tool and the properties of the AI system and its components.
- Empirical results obtained using sufficiently large and representative sets of test inputs. Example: uncertainties are typically required to be well-calibrated. For a tool providing 95% confidence intervals for the outputs of an AI system, this means that the true value approximated by the model output (which is unknown during deployment) falls within the provided interval for 95% of the test inputs. Evidence should thus be provided that the confidence intervals have this property to a sufficient accuracy and precision (see the calibration sketch after this list).
vi. Instructions on when and how to use the tool, including instructions on how to act upon observing the tool’s output in order to fulfill its purpose.
vii. A risk assessment, including a discussion of possible failure modes of the tool and the possible consequences of failures for the appropriate use of the AI system. The following information should also be provided:
viii. (A reference to) the technical specification of the interpretation tool.
ix. Technical details of the experiments conducted to determine the accuracy and precision of the information provided by the tool and to determine its fitness for purpose.
x. Details on the derivation of theoretical guarantees for the accuracy and precision of the information provided by the tool and for its fitness for purpose.
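To make the calibration example in requirement (v) concrete, the following sketch constructs 95% intervals with split conformal prediction around an arbitrary point predictor and then reports the empirically measured coverage on held-out test inputs. The synthetic data, the polynomial predictor, and the split sizes are assumptions made purely for illustration.

```python
# Minimal sketch of the calibration check in requirement (v): build 95%
# intervals via split conformal prediction and report empirical coverage.
# The synthetic data, polynomial predictor, and split sizes are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 3000
x = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(x[:, 0]) + 0.3 * rng.normal(size=n)

fit, cal, test = np.split(rng.permutation(n), [1000, 2000])  # disjoint index sets

# Stand-in point predictor: a small polynomial regression.
coef = np.polyfit(x[fit, 0], y[fit], deg=5)
def predict(xs: np.ndarray) -> np.ndarray:
    return np.polyval(coef, xs[:, 0])

# Split conformal: calibrate the interval half-width on held-out residuals.
alpha = 0.05
resid = np.abs(y[cal] - predict(x[cal]))
half_width = np.quantile(resid, np.ceil((1 - alpha) * (len(cal) + 1)) / len(cal))

# Evidence for requirement (v): empirically measured coverage of the intervals.
covered = np.abs(y[test] - predict(x[test])) <= half_width
print(f"nominal coverage: {1 - alpha:.0%}, empirical coverage: {covered.mean():.1%}")
```

A report following requirement (v) would state the observed coverage together with its uncertainty (e.g., a binomial confidence interval over the test set) as the empirical evidence of the tool’s accuracy and precision.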
3. Conclusions
