Preprint
Article

This version is not peer-reviewed.

Autoencoder-Enhanced Hierarchical Mondrian Anonymization via Latent Representations

Submitted: 02 February 2026

Posted: 03 February 2026

Abstract
Releasing structured microdata requires balancing utility against privacy under group-based disclosure risks. We propose AE-LRHMA, a hybrid anonymization framework that performs Mondrian-style hierarchical partitioning in an autoencoder-learned latent space and integrates local (k, e)-microaggregation. To explicitly control sensitive-value concentration and diversity within each equivalence class, we introduce a tunable constraint set consisting of k, a maximum sensitive proportion threshold, and an optional sensitive-entropy threshold. When enabled, the entropy threshold acts as a hard gate; otherwise, entropy enters split scoring as a soft term. The anonymized output is generated via standard interval/set generalization in the original space. Experiments on Adult and Bank Marketing show that AE-LRHMA yields lower information loss and more stable group structures than representative baselines under comparable settings. We further report linkage-attack-oriented risk metrics to empirically characterize relative disclosure trends, without claiming formal guarantees such as differential privacy.
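To make the constraint set concrete, the following is a minimal sketch of how an equivalence class might be checked against k, a maximum sensitive proportion, and an optional entropy gate. It is an illustration under stated assumptions, not the authors' implementation: the function names `sensitive_entropy` and `is_admissible`, the parameter names `max_prop` and `min_entropy`, and the choice of Shannon entropy in bits are all hypothetical. It covers only the hard-gate use of the entropy threshold, not the soft term in split scoring.

```python
import math
from collections import Counter
from typing import Optional, Sequence


def sensitive_entropy(values: Sequence[str]) -> float:
    """Shannon entropy (in bits) of the sensitive values in one group.

    Assumption: the abstract does not specify the entropy definition or
    its base; base 2 is used here purely for illustration.
    """
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def is_admissible(
    values: Sequence[str],
    k: int,
    max_prop: float,
    min_entropy: Optional[float] = None,
) -> bool:
    """Check one candidate equivalence class against the constraint set.

    values      -- sensitive attribute values of the records in the group
    k           -- minimum group size
    max_prop    -- maximum allowed share of the most frequent sensitive value
    min_entropy -- optional hard gate on sensitive-value entropy
                   (None means the gate is disabled)
    """
    n = len(values)
    if n == 0 or n < k:
        return False  # group is empty or too small
    most_common_count = Counter(values).most_common(1)[0][1]
    if most_common_count / n > max_prop:
        return False  # one sensitive value dominates the group
    if min_entropy is not None and sensitive_entropy(values) < min_entropy:
        return False  # entropy gate enabled and not met
    return True


# Hypothetical group of five records with a binary sensitive attribute.
group = [">50K", "<=50K", "<=50K", ">50K", "<=50K"]
print(is_admissible(group, k=5, max_prop=0.7))
print(is_admissible(group, k=5, max_prop=0.7, min_entropy=0.99))
```

In a Mondrian-style partitioner, a check of this kind would typically be applied to both children of a candidate split, with the split rejected if either child fails.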
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.
