Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

A Fast K-prototypes Algorithm Using Partial Distance Computation

Version 1 : Received: 17 April 2017 / Approved: 17 April 2017 / Online: 17 April 2017 (11:08:26 CEST)

A peer-reviewed article of this Preprint also exists.

Kim, B. A Fast K-prototypes Algorithm Using Partial Distance Computation. Symmetry 2017, 9, 58. Kim, B. A Fast K-prototypes Algorithm Using Partial Distance Computation. Symmetry 2017, 9, 58.

Abstract

The k-means is one of the most popular and widely used clustering algorithm, however, it is limited to only numeric data. The k-prototypes algorithm is one of the famous algorithms for dealing with both numeric and categorical data. However, there have been no studies to accelerate k-prototypes algorithm. In this paper, we propose a new fast k-prototypes algorithm that gives the same answer as original k-prototypes. The proposed algorithm avoids distance computations using partial distance computation. Our k-prototypes algorithm finds minimum distance without distance computations of all attributes between an object and a cluster center, which allows it to reduce time complexity. A partial distance computation uses a fact that a value of the maximum difference between two categorical attributes is 1 during distance computations. If data objects have m categorical attributes, maximum difference of categorical attributes between an object and a cluster center is m. Our algorithm first computes distance with only numeric attributes. If a difference of the minimum distance and the second smallest with numeric attributes is higher than m, we can find minimum distance between an object and a cluster center without distance computations of categorical attributes. The experimental shows proposed k-prototypes algorithm improves computational performance than original k-prototypes algorithm in our dataset.

Keywords

clustering algorithm; k-prototypes algorithm, partial distance computation

Subject

Computer Science and Mathematics, Computer Science

Comments (0)

We encourage comments and feedback from a broad range of readers. See criteria for comments and our Diversity statement.

Leave a public comment
Send a private comment to the author(s)
* All users must log in before leaving a comment
Views 0
Downloads 0
Comments 0
Metrics 0


×
Alerts
Notify me about updates to this article or when a peer-reviewed version is published.
We use cookies on our website to ensure you get the best experience.
Read more about our cookies here.