×

Systems of sets such that each set properly intersects at most one other set – application to cluster analysis. (English) Zbl 1295.05247

Summary: A set \(X\) is said to properly intersect a set \(Y\) if none of the sets \(X\cap Y\), \(X\setminus Y\) and \(Y\setminus X\) is empty. In this paper, we consider collections of subsets such that each member of the collection properly intersects at most one other member. Such collections are hereafter called paired hierarchical collections. The two following combinatorial properties are investigated. First, any paired hierarchical collection is a set of intervals of at least one linear order defined on the ground set. Next, the maximum size of a paired hierarchical collection defined on an \(n\)-element set is \(\lfloor \frac{5}{2}(n-1)\rfloor\). The properties of these collections are also investigated from the cluster analysis point of view. In the framework of the general bijection defined by A. Batbedat [Metron 46, No. 1–4, 47–59 (1988; Zbl 0724.05055)] and the author [Eur. J. Comb. 21, No. 6, 727–743 (2000; Zbl 0957.05103)], we characterize the dissimilarities that are induced by weakly indexed paired hierarchical collections. Finally, we propose a proof of the so-called agglomerative paired hierarchical clustering (APHC) algorithm that extends the well-known AHC algorithm in order to allow that some clusters can be merged twice. An implementation and some illustrations of this algorithm and of a variant of it were presented by S. Chelcea et al. [in: Innovations in classification, data science, and information systems. Proceedings of the 27th annual conference of the Gesellschaft für Klassifikation e.V., Cottbus, Germany, March 12–14, 2003. Berlin: Springer. 3–10 (2005; Zbl 1296.62129); “Un nouvel algorithme de classification ascendante 2–3 hiérarchique”, in: Reconnaissance des formes et intelligence artificielle (RFIA 2004), Vol. 3, Toulouse, France, 2004. 1471–1480 (2004), http://www.laas.fr/rfia2004/actes/ARTICLES/388.pdf].

MSC:

05D05 Extremal set theory
62H30 Classification and discrimination; cluster analysis (statistical aspects)
PDF BibTeX XML Cite
Full Text: DOI

References:

[1] Bandelt, H.-J.; Dress, A.W.M., Weak hierarchies associated with similarity measures: an additive clustering technique, Bull. math. biology, 51, 133-166, (1989) · Zbl 0666.62058
[2] Barthélemy, J.-P.; Brucker, F., NP-hard approximation problems in overlapping clustering, J. classification, 18, 159-183, (2001) · Zbl 1040.91084
[3] Batbedat, A., LES isomorphismes HTS et HTE (après la bijection de benzécri – johnson), Metron, 46, 47-59, (1988) · Zbl 0724.05055
[4] J.-P. Benzécri, L’Analyse des données: la Taxinomie, vol. 1, Dunod, Paris, 1973.
[5] P. Bertrand, Structural properties of pyramidal clustering, in DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 19, 1995, pp. 35-53. · Zbl 0814.62032
[6] Bertrand, P., Set systems and dissimilarities, European J. combin., 21, 727-743, (2000) · Zbl 0957.05103
[7] P. Bertrand, Set systems for which each set properly intersects at most one other set—Application to pyramidal clustering, cahier du Ceremade 0202, University of Paris-Dauphine, France, 2002.
[8] Bertrand, P.; Diday, E., Une généralisation des arbres hiérarchiques: LES représentations pyramidales, Rev. statist. appl., 38, 53-78, (1990)
[9] Chelcea, S.; Bertrand, P.; Trousse, B., A new agglomerative 2-3 hierarchical clustering algorithm, (), 3-10 · Zbl 1296.62129
[10] S. Chelcea, P. Bertrand, B. Trousse, Un Nouvel Algorithme de Classification Ascendante 2-3 Hiérarchique, in: Reconnaissance des Formes et Intelligence Artificielle (RFIA 2004), vol. 3, Toulouse, France, 2004, pp. 1471-1480. Available at \(\langle\)http://www.laas.fr/rfia2004/actes/ARTICLES/388.pdf⟩.
[11] Diatta, J.; Fichet, B., From apresjan hierarchies and bandelt – dress weak hierarchies to quasi-hierarchies, (), 111-118
[12] Diatta, J.; Fichet, B., Quasi-ultrametrics and their 2-ball hypergraphs, Discrete math., 192, 87-102, (1998) · Zbl 0951.54023
[13] E. Diday, Une représentation visuelle des classes empiétantes: les pyramides, Rapport de recherche I.N.R.I.A. no. 291, Rocquencourt, France, 1984.
[14] Diday, E., Orders and overlapping clusters in pyramids, (), 201-234
[15] Durand, C.; Fichet, B., One-to-one correspondences in pyramidal representation: a unified approach, (), 85-90
[16] Fichet, B., Data analysis: geometric and algebraic structures, (), 123-132 · Zbl 0671.62005
[17] Gaul, W.; Schader, M., Pyramidal classification based on incomplete dissimilarity data, J. classification, 11, 171-193, (1994) · Zbl 0808.62056
[18] Jain, A.K.; Dubes, R.C., Algorithms for clustering data, (1988), Prentice-Hall Englewood Cliffs, NJ · Zbl 0665.62061
[19] Jardine, N.; Sibson, R., Mathematical taxonomy, (1971), Wiley New York · Zbl 0322.62065
[20] Johnson, S.C., Hierarchical clustering schemes, Psychometrika, 32, 241-254, (1967) · Zbl 1367.62191
[21] Lance, G.N.; Willians, W.T., A generalized theory of classificatory sorting strategies. I. hierarchical systems, Comput. J., 1, 15-20, (1967)
[22] Robinson, W.S., A method for chronological ordering of archeological deposits, Amer. antiquity, 16, 4, 293-301, (1951)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.