Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Explainable Graph Neural Networks: An Application to Open Statistics Knowledge Graphs for Estimating House Prices

Version 1 : Received: 30 April 2024 / Approved: 30 April 2024 / Online: 1 May 2024 (07:33:52 CEST)

How to cite: Karamanou, A.; Brimos, P.; Kalampokis, E.; Tarabanis, K. Explainable Graph Neural Networks: An Application to Open Statistics Knowledge Graphs for Estimating House Prices. Preprints 2024, 2024050037. https://doi.org/10.20944/preprints202405.0037.v1

Abstract

In the rapidly evolving field of real estate economics, predicting house prices remains a complex challenge, closely tied to a multitude of socio-economic factors. Traditional predictive models, however, have often overlooked the spatial interdependencies that play a vital role in shaping housing prices. This study applies Graph Neural Networks (GNNs) to Open Statistics Knowledge Graphs to model spatial dependencies and predict house prices across Scotland’s 2011 data zones. To this end, integrated statistical indicators are retrieved from the official Scottish Open Government Data portal. The three representative GNN algorithms employed (ChebNet, GCN, and GraphSAGE) demonstrate higher prediction accuracy than traditional models, including the tabular-based XGBoost and a simple Multi-Layer Perceptron (MLP). In addition, local and global explainability methods are employed to increase transparency and trust in the predictions made by the most accurate GNN, GraphSAGE. Global feature importance is determined by a logistic regression surrogate model, while the local, region-level understanding of the GNN predictions is achieved with GNNExplainer. The explainability results are compared with those of a previous work that applied the XGBoost machine learning algorithm and the SHapley Additive exPlanations (SHAP) explainability framework to the same dataset. Interestingly, both the global surrogate model and the SHAP approach identified the Comparative Illness Factor, a health indicator, and the ratio of detached dwellings as the most important features for global explainability. For local explanations, both methods showed similar results, but the GNN approach provided a richer, more comprehensive understanding of the predictions for two specific data zones.
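
To give a rough sense of how a GNN such as GraphSAGE lets neighbouring data zones inform one another's predictions, the following is a minimal sketch of a single GraphSAGE-style layer with a mean aggregator. It is an illustration only and not the authors' implementation: the three-node graph, the two-dimensional features, the weight matrices, and the helper names (`matvec`, `mean_vectors`, `sage_layer`) are all hypothetical, and a real model would learn the weights and stack several layers.

```python
# Minimal sketch of one GraphSAGE-style layer with a mean aggregator.
# The toy graph, features, and weights are hypothetical, not from the paper.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mean_vectors(vectors, dim):
    """Element-wise mean of a list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def sage_layer(features, neighbors, W_self, W_neigh):
    """Compute ReLU(W_self @ h_v + W_neigh @ mean(h_u for u in N(v))) per node."""
    dim = len(next(iter(features.values())))
    out = {}
    for node, h in features.items():
        # Aggregate the neighbours' features by averaging them.
        agg = mean_vectors([features[n] for n in neighbors[node]], dim)
        # Combine the node's own features with the aggregated neighbourhood.
        z = [a + b for a, b in zip(matvec(W_self, h), matvec(W_neigh, agg))]
        # ReLU non-linearity.
        out[node] = [max(0.0, v) for v in z]
    return out

# Three hypothetical regions; A borders B and C.
features = {"A": [1.0, 0.0], "B": [0.0, 2.0], "C": [4.0, 2.0]}
neighbors = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
W_self = [[1.0, 0.0], [0.0, 1.0]]
W_neigh = [[0.5, 0.0], [0.0, -1.0]]

embeddings = sage_layer(features, neighbors, W_self, W_neigh)
```

Each node's output mixes its own indicators with the average of its neighbours' indicators, which is the mechanism that allows spatial dependencies to enter the prediction.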

Keywords

Linked Statistical Data; Knowledge Graphs; Graph Neural Networks; Explainable Artificial Intelligence; House Price Prediction; Explainable Graph Neural Networks

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
