Working Paper Article Version 1 This version is not peer-reviewed

Assessing the Price in Data utility of k-Anonymous Microaggregation

Version 1 : Received: 22 July 2019 / Approved: 23 July 2019 / Online: 23 July 2019 (11:42:34 CEST)

How to cite: Rodriguez-Hoyos, A.; Estrada-Jiménez, J.; Rebollo-Monedero, D.; Forné, J.; Parra-Arnau, J.; Urquiza-Aguiar, L. Assessing the Price in Data utility of k-Anonymous Microaggregation. Preprints 2019, 2019070260

Abstract

With a data revolution underway for some time, there is an increasing demand for formal privacy protection mechanisms that are less destructive to data. In this regard, microaggregation is a high-utility approach designed to satisfy the popular k-anonymity criterion while applying low distortion to data. However, standard performance metrics are commonly based on mean squared error, which hardly captures the utility degradation related to a specific application domain of the data. In this work, we evaluate the performance of k-anonymous microaggregation in terms of the loss in classification accuracy of machine-learned models built from perturbed data. Systematic experimentation is carried out on four microaggregation algorithms, which are tested over four data sets. The empirical utility of the resulting microaggregated data is assessed using the learning algorithm that obtains the highest accuracy on the original data. Validation tests are performed on a test set of unperturbed data. The results confirm that k-anonymous microaggregation is a high-utility privacy mechanism in this context and that distortion based on mean squared error is a poor predictor of practical utility. Finally, we corroborate the beneficial effects on empirical utility of exploiting the statistical properties of data when constructing privacy-preserving algorithms.
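To make the two notions in the abstract concrete, the following is a minimal sketch of k-anonymous microaggregation and of distortion measured as mean squared error. It is not one of the four algorithms evaluated in the paper: the grouping heuristic (ordering records by the sum of their attributes) and the names `microaggregate` and `mse` are assumptions chosen purely for illustration, and the distortion is taken as the average squared Euclidean distance per record, which is one common convention.

```python
from typing import List

Record = List[float]


def microaggregate(records: List[Record], k: int) -> List[Record]:
    """Replace each record by the centroid of a group of at least k records."""
    n = len(records)
    if k < 1 or n < k:
        raise ValueError("need 1 <= k <= number of records")
    dim = len(records[0])
    # Illustrative heuristic (an assumption): order records by attribute sum.
    order = sorted(range(n), key=lambda i: sum(records[i]))
    # Consecutive groups of size k; the last group absorbs the remainder,
    # so every group contains at least k records.
    starts = list(range(0, n - n % k, k))
    groups = [order[s:s + k] for s in starts]
    groups[-1].extend(order[len(starts) * k:])
    out: List[Record] = [[] for _ in range(n)]
    for group in groups:
        centroid = [sum(records[i][d] for i in group) / len(group)
                    for d in range(dim)]
        for i in group:
            out[i] = list(centroid)  # every group member gets the same values
    return out


def mse(original: List[Record], perturbed: List[Record]) -> float:
    """Average squared Euclidean distance between matching records."""
    total = sum((a - b) ** 2
                for x, y in zip(original, perturbed)
                for a, b in zip(x, y))
    return total / len(original)
```

Because every output record shares its values with at least k - 1 others, the released table is k-anonymous with respect to the aggregated attributes. The evaluation described in the abstract would then train a classifier on the output of `microaggregate` and measure its accuracy on unperturbed test records, instead of relying on `mse` alone.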

Keywords

microaggregation; k-anonymity; privacy; data utility

Subject

Computer Science and Mathematics, Information Systems
