Unsupervised Hierarchical Visual Taxonomy of Marble Natural Stone Using Cluster-Aware Self-Supervised Vision Transformers

Margarida Tânger de Oliveira Figueiredo; Carlos M. A. Diogo; Gustavo Paneiro; Pedro Amaral; António Alves de Campos

doi:10.20944/preprints202603.1344.v1

Submitted: 16 March 2026

Posted: 17 March 2026

Abstract

The marble industry relies on proprietary commercial names rather than objective visual categories, creating market inefficiencies for stakeholders who select stones based on appearance. Supervised classification methods perpetuate this problem by replicating inconsistent commercial labels instead of discovering intrinsic visual structure. We propose an unsupervised pipeline that combines a two-stage training strategy (pure self-supervised pretraining followed by cluster-aware fine-tuning of a DINO Vision Transformer) with UMAP dimensionality reduction and Ward's agglomerative hierarchical clustering. Systematic ablation studies on 1,540 marble images spanning 10 commercial varieties validate each design choice: cluster-aware training at k=10 yields better embeddings than the self-supervised baseline (Silhouette Score 0.778 vs. 0.761; Davies–Bouldin Index 0.293 vs. 0.364), UMAP compression to five dimensions resolves high-dimensional noise pathologies, and Ward's linkage produces the most compact partitions. The resulting taxonomy reveals three phenomena invisible to commercial classification: cross-category merging of visually indistinguishable stones carrying different market names, intra-category splitting of heterogeneous sub-populations within single varieties, and coherent grouping where commercial and visual boundaries coincide. We further demonstrate that standard extrinsic metrics are misaligned with unsupervised taxonomy objectives when the reference labels encode the very inconsistencies the method aims to resolve. This work provides a validated methodology for data-driven visual classification in the natural stone industry and a transferable template for domains with unreliable labelling conventions.
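The clustering and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the DINO embedding and UMAP reduction stages are omitted, a handful of made-up 2-D points stands in for the five-dimensional reduced embeddings, and all function names are hypothetical. The merge rule is the standard Ward criterion (merge the pair of clusters whose union least increases the within-cluster sum of squares), and the two scores are the usual definitions of the Silhouette Score (higher is better) and the Davies–Bouldin Index (lower is better). The naive loops are for readability only; a real pipeline would more likely rely on an established clustering library.

```python
import math
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clustering(points, k):
    """Naive agglomerative clustering: merge the cheapest pair until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # combinations yields i < j, so deleting index j leaves index i valid.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def silhouette_score(clusters):
    """Mean of (b - a) / max(a, b) over all points."""
    scores = []
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(p, q) for q in cluster) / (len(cluster) - 1)
            # b: mean distance to the nearest other cluster
            b = min(sum(math.dist(p, q) for q in other) / len(other)
                    for cj, other in enumerate(clusters) if cj != ci)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def davies_bouldin_index(clusters):
    """Mean over clusters of the worst (s_i + s_j) / d_ij ratio."""
    cents = [centroid(c) for c in clusters]
    # s_i: mean distance of a cluster's points to its own centroid
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in range(len(clusters)) if j != i)
             for i in range(len(clusters))]
    return sum(worst) / len(worst)

# Toy stand-in for reduced embeddings: three well-separated groups (made up).
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1),
    (10.0, 0.0), (10.1, 0.2), (10.2, 0.1),
]
clusters = ward_clustering(points, k=3)
print("silhouette:", silhouette_score(clusters))
print("davies-bouldin:", davies_bouldin_index(clusters))
```

Because both scores are intrinsic, they need no commercial labels, which is why they suit a setting where the reference labels themselves are considered unreliable.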

Keywords: self-supervised learning; vision transformer; DINO; deep clustering; hierarchical clustering; marble classification; unsupervised visual taxonomy

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
