Preprint Article, Version 1. Preserved in Portico. This version is not peer-reviewed.

SPHN Strategy to Unravel the Semantic Drift Between Versions of Standard Terminologies

Version 1 : Received: 7 December 2023 / Approved: 7 December 2023 / Online: 7 December 2023 (10:28:20 CET)

How to cite: Unni, D.; Touré, V.; Krauss, P.; Crameri, K.; Österle, S. SPHN Strategy to Unravel the Semantic Drift Between Versions of Standard Terminologies. Preprints 2023, 2023120508. https://doi.org/10.20944/preprints202312.0508.v1

Abstract

The Swiss Personalized Health Network (SPHN) has developed a national framework for enabling the semantic representation of health data within a Knowledge Graph. This framework has been implemented in all Swiss university hospitals, promoting seamless sharing and integration of clinical routine data with other health-related data, including omics and clinical research data. While research projects often have flexibility in selecting terminologies and specific versions, historical clinical routine data are typically coded for billing or administrative purposes using predefined terminologies in various (sometimes even unknown) versions over time. Some of these terminologies do not adhere to best practices for ontology design, presenting significant challenges to the retrospective re-use of such coded data for research. Common issues with these terminologies include the lack of machine-readable traceability across versions and non-adherence to FAIR principles. Terms from older versions frequently disappear in newer ones, making it challenging to distinguish outdated terms from invalid ones. Additionally, 'semantic drift' occurs, where the meaning of a term changes across versions. To address these challenges, we have implemented FAIR and historized versions of ATC, CHOP, and ICD-10-GM. We represent each version in RDF using versioned URIs and track meaning changes between versions in a machine-readable way using OWL and RDFS. The integration of these historized terminologies into our quality control framework, based on SHACL, enables comprehensive data quality control in hospitals and empowers researchers to use these data effectively. Our work aims to bridge the gap between health data coded in different terminology versions, ensuring a consistent and reliable semantic representation.
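
To make the idea of versioned URIs and cross-version traceability concrete, the following is a minimal Python sketch, not the SPHN implementation. The base namespace, the URI pattern, the toy release contents, and the use of owl:equivalentClass to link codes whose meaning is unchanged are all assumptions made for illustration; the abstract does not specify which OWL and RDFS constructs are used.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical base namespace; the real SPHN URIs may follow a different pattern.
BASE = "https://example.org/terminology"

# Toy release contents (version -> codes). Illustrative only, not real terminology data.
RELEASES: Dict[str, Set[str]] = {
    "2020": {"A00", "B99", "C10"},
    "2021": {"A00", "C10"},
    "2022": {"A00", "C10", "D20"},
}

# (version, code) pairs whose meaning is assumed to differ from the previous release.
MEANING_CHANGED: Set[Tuple[str, str]] = {("2022", "C10")}


def versioned_uri(terminology: str, version: str, code: str) -> str:
    """Build a version-specific URI for a code (assumed pattern)."""
    return f"{BASE}/{terminology}/{version}/{code}"


def build_triples(
    terminology: str,
    releases: Dict[str, Set[str]],
    meaning_changed: Set[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Link each code to its predecessor when its meaning is unchanged.

    Codes flagged in `meaning_changed` get no equivalence link, so semantic
    drift shows up as a missing link between consecutive versions.
    """
    triples = []
    versions = sorted(releases)
    for prev, curr in zip(versions, versions[1:]):
        for code in sorted(releases[prev] & releases[curr]):
            if (curr, code) in meaning_changed:
                continue
            triples.append((
                versioned_uri(terminology, curr, code),
                "owl:equivalentClass",  # assumed predicate choice
                versioned_uri(terminology, prev, code),
            ))
    return triples


def classify(code: str, releases: Dict[str, Set[str]], current: str) -> str:
    """Separate outdated codes (seen in some release) from invalid ones (never seen)."""
    if code in releases[current]:
        return "valid"
    if any(code in codes for codes in releases.values()):
        return "outdated"
    return "invalid"


TRIPLES = build_triples("ICD-10-GM", RELEASES, MEANING_CHANGED)
```

With historized releases available, a code that is absent from the current version can be recognised as outdated rather than invalid, which is the distinction a version-unaware check cannot make. In a real setting the triples would be serialized as RDF and the check expressed as a SHACL shape rather than a Python function.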

Keywords

standards; ontologies; semantic drift; versioning; semantic web; FAIR principles

Subject

Biology and Life Sciences, Other
