Preprint
Article

Unsupervised Feature Selection for Histogram-Valued Symbolic Data by Hierarchical Conceptual Clustering

Altmetrics

Downloads

158

Views

287

Comments

0

A peer-reviewed article of this preprint also exists.

This version is not peer-reviewed

Submitted:

30 March 2021

Posted:

31 March 2021

You are already at the latest version

Alerts
Abstract
This paper presents an unsupervised feature selection method for multi-dimensional histogram-valued data. We define a multi-role measure, called the compactness, based on the concept size of given objects and/or clusters described by a fixed number of equal probability bin-rectangles. In each step of clustering, we agglomerate objects and/or clusters so as to minimize the compactness for the generated cluster. This means that the compactness plays the role of a similarity measure between objects and/or clusters to be merged. To minimize the compactness is equivalent to maximize the dis-similarity of the generated cluster, i.e., concept, against the whole concept in each step. In this sense, the compactness plays the role of cluster quality. We also show that the average compactness of each feature with respect to objects and/or clusters in several clustering steps is useful as feature effectiveness criterion. Features having small average compactness are mutually covariate, and are able to detect geometrically thin structure embedded in the given multi-dimensional histogram-valued data. We obtain thorough understandings of the given data by the visualization using dendrograms and scatter diagrams with respect to the selected informative features. We illustrate the effectiveness of the proposed method by using an artificial data set and real histogram-valued data sets.
Keywords: 
Subject: Computer Science and Mathematics  -   Algebra and Number Theory
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permit the free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Prerpints.org logo

Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

Subscribe

© 2024 MDPI (Basel, Switzerland) unless otherwise stated