Version 1
Received: 28 November 2019 / Approved: 3 December 2019 / Online: 3 December 2019 (03:34:19 CET)
How to cite:
Hirai, S.; Yamanishi, K. Kernel Complexity for Nonparametric Distributions and Detection of Its Changes. Preprints 2019, 2019110381. https://doi.org/10.20944/preprints201911.0381.v1
APA Style
Hirai, S., & Yamanishi, K. (2019). Kernel Complexity for Nonparametric Distributions and Detection of Its Changes. Preprints. https://doi.org/10.20944/preprints201911.0381.v1
Chicago/Turabian Style
Hirai, S. and Kenji Yamanishi. 2019. "Kernel Complexity for Nonparametric Distributions and Detection of Its Changes." Preprints. https://doi.org/10.20944/preprints201911.0381.v1
Abstract
This paper addresses two questions: how to quantify structural information for nonparametric distributions, and how to detect changes in it. Structural information refers to an index that supports a global understanding of a data distribution. For example, when clustering with a parametric model such as a Gaussian mixture model, the number of mixture components (clusters) can be regarded as the structural information of the model. No comparable notion of structural information exists, however, for nonparametric modeling of data. In this paper we introduce a novel notion of kernel complexity (KC) as structural information in the nonparametric setting. The key idea of KC is to combine the information bias, inspired by the Gini index, with the information quantity, measured in terms of the normalized maximum likelihood (NML) code length. We empirically show that KC has a property similar to the number of clusters in a parametric model. We further propose a framework for structural change detection with KC in nonparametric distributions. Using synthetic and real data sets, we empirically demonstrate that our framework detects structural changes underlying the data as well as their early warning signals.
Keywords
change-point detection; kernel density estimation; normalized maximum likelihood (NML); minimum description length (MDL) principle; kernel complexity (KC)
Subject
Computer Science and Mathematics, Probability and Statistics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.