Hybrid clustering of data and vague concepts based on labels semantics.

*(English)* Zbl 1425.68353

Summary: Data clustering is the process of dividing data elements into clusters so that items in the same cluster are as similar as possible and items in different clusters are as dissimilar as possible. A key issue in clustering is how to define a sensible similarity measure. Such measures usually handle data in one modality and are unable to cluster data from different modalities. Based on the fuzzy set and prototype theory interpretations of label semantics, two (dis)similarity measures are proposed with which data and vague concepts, represented by logical expressions of linguistic labels, can be clustered automatically. Experimental results on a toy problem and on an image classification problem demonstrate the effectiveness of the new clustering algorithms. Since the newly proposed measures can be extended to measure the distance between any two granularities, the new clustering algorithms can also be extended to cluster data instances and imprecise concepts represented by other granularities.
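The role that the (dis)similarity measure plays in such hybrid clustering can be illustrated with a minimal sketch. It does not reproduce the measures proposed in the paper: as an assumption made only for illustration, a vague concept is stood in for by a closed interval, a data point is treated as a degenerate interval, and the Hausdorff distance between intervals serves as the dissimilarity. The clustering loop is a plain k-medoids variant rather than the authors' algorithm, and the names `as_interval`, `dissimilarity`, and `k_medoids` are hypothetical.

```python
from typing import List, Sequence, Tuple, Union

# A data point, or an interval standing in for a vague concept (assumption).
Item = Union[float, Tuple[float, float]]


def as_interval(item: Item) -> Tuple[float, float]:
    """Treat a point x as the degenerate interval [x, x]."""
    if isinstance(item, tuple):
        return item
    return (float(item), float(item))


def dissimilarity(a: Item, b: Item) -> float:
    """Hausdorff distance between two closed intervals (illustrative stand-in)."""
    (a_lo, a_hi), (b_lo, b_hi) = as_interval(a), as_interval(b)
    return max(abs(a_lo - b_lo), abs(a_hi - b_hi))


def k_medoids(items: Sequence[Item], k: int, max_iter: int = 100) -> List[int]:
    """Return a cluster index per item; the first k items are the initial medoids."""
    medoids = list(range(k))
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each item joins the cluster of its nearest medoid.
        labels = [
            min(range(k), key=lambda c: dissimilarity(item, items[medoids[c]]))
            for item in items
        ]
        # Update step: the new medoid minimises total dissimilarity in its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the old medoid for an empty cluster
                continue
            best = min(
                members,
                key=lambda i: sum(dissimilarity(items[i], items[j]) for j in members),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels


# Points and interval "concepts" mixed in one collection.
items: List[Item] = [0.1, 0.2, (0.0, 0.3), 5.0, 5.2, (4.8, 5.5)]
labels = k_medoids(items, k=2)
print(labels)
```

Because the loop only ever calls `dissimilarity`, points and concepts can be clustered together as soon as a measure between the two kinds of object is available; swapping in a different measure requires no change to the clustering code.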

##### MSC:

68T05 | Learning and adaptive systems in artificial intelligence |

68T37 | Reasoning under uncertainty in the context of artificial intelligence |

##### Keywords:

label semantics; linguistic labels; logical expressions; K-means; imprecise concept clustering

##### Software:

LFOIL

*Z. Qin* et al., Ann. Oper. Res. 256, No. 2, 393-416 (2017; Zbl 1425.68353)

