On Exponentially Consistency Of Linkage-Based Hierarchical Clustering Algorithm Using Kolmogrov-Smirnov Distance
Tiexing Wang, Yang Liu, Biao Chen
-
SPS
IEEE Members: $11.00
Non-members: $15.00Length: 13:22
This paper focuses on performance analysis of linkage-based hierarchical agglomerative clustering algorithms for sequence clustering using the Kolmogrov-Smirnov distance. Data sequences are assumed to be generated from unknown continuous distributions. The goal is to group the data sequences whose underlying generative distributions belong to one cluster without a priori knowledge of both the underlying distributions as well as the number of clusters. Upper bounds on the clustering error probability are derived. The upper bounds help establish the fact that the error probability decays exponentially fast as the sequence length goes to infinity and the obtained error exponent bound has a simple form. Tighter upper bounds on the error probability of single-linkage and complete-linkage algorithms are derived by taking advantage of the simplified metric updating for these two special cases. Simulation results are provided to validate the analysis.