Stability of density-based clustering. (English) Zbl 1283.62130
Summary: High density clusters can be characterized by the connected components of a level set \(L(\lambda) = \{x : p(x) > \lambda\}\) of the underlying probability density function \(p\) generating the data, at some appropriate level \(\lambda \geq 0\). The complete hierarchical clustering can be characterized by a cluster tree \(\mathcal T = \bigcup_{\lambda} L(\lambda)\). In this paper, we study the behavior of a density level set estimate \(\widehat L(\lambda)\) and cluster tree estimate \(\widehat{\mathcal T}\) based on a kernel density estimator with kernel bandwidth \(h\). We define two notions of instability to measure the variability of \(\widehat L(\lambda)\) and \(\widehat{\mathcal T}\) as a function of \(h\), and investigate the theoretical properties of these instability measures.
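
As a rough illustration of the objects named in the summary, the following Python sketch estimates a level set \(\widehat L(\lambda)\) on a grid from a one-dimensional Gaussian kernel density estimate and reports how it changes with the bandwidth \(h\). The sample-splitting measure `split_instability`, the helper names, the simulated two-cluster data, the grid, and the level `lam=0.05` are all assumptions made for illustration; the paper's own instability definitions are not given in this summary and may differ.

```python
import numpy as np

def kde(grid, sample, h):
    """Gaussian kernel density estimate with bandwidth h, evaluated on a 1-D grid."""
    z = (grid[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

def level_set(grid, sample, h, lam):
    """Boolean mask of grid points in the estimated level set {x : p_hat_h(x) > lam}."""
    return kde(grid, sample, h) > lam

def count_components(mask):
    """Number of connected components (maximal runs of True) in a 1-D boolean mask."""
    padded = np.concatenate(([False], mask))
    # A component starts wherever a True follows a False.
    return int(np.sum(padded[1:] & ~padded[:-1]))

def split_instability(grid, sample, h, lam, rng):
    """Illustrative (assumed) instability measure: length of the symmetric
    difference between level sets estimated from two random halves of the sample."""
    perm = rng.permutation(len(sample))
    half = len(sample) // 2
    a = level_set(grid, sample[perm[:half]], h, lam)
    b = level_set(grid, sample[perm[half:]], h, lam)
    dx = grid[1] - grid[0]
    return float(np.sum(a ^ b) * dx)

# Simulated data: two well-separated clusters (an assumption for the demo).
rng = np.random.default_rng(0)
sample = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])
grid = np.linspace(-5, 5, 501)

for h in (0.05, 0.3, 1.0):
    mask = level_set(grid, sample, h, lam=0.05)
    print(h, count_components(mask), split_instability(grid, sample, h, 0.05, rng))
```

Each printed row gives the bandwidth, the number of connected components of the estimated level set, and the split-based variability at that bandwidth. Varying `h` in this way is one simple means of seeing how the estimated clusters depend on the amount of smoothing.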

62H30 Classification and discrimination; cluster analysis (statistical aspects)
68T05 Learning and adaptive systems in artificial intelligence