A formula for multiple classifiers in data mining based on Brandt semigroups. (English) Zbl 1168.94007
Classification of data plays a central role in data mining and, more generally, in practical applications of artificial intelligence methods (see [{\it J. L. Yearwood} and {\it M. A. Mammadov}, Classification technologies: optimization approaches to short text categorization. Idea Group Inc. (2007)]). A well-known method of designing efficient multiple classifiers consists in representing them as several binary classifiers combined in one scheme. The main advantage of such combined multiple classifiers is that they can correct errors of individual binary classifiers and still produce correct classifications. Determining the number of errors of individual binary classifiers that a multiple classifier can correct is, in general, rather complicated. It is well known that in full generality this problem is related to several other very difficult algorithmic problems (see [{\it J. L. Yearwood} and {\it M. A. Mammadov}, loc. cit.]). This note uses semigroup rings to introduce additional structure on the class sets of multiple classifiers, which makes it possible to generate these sets with a small number of generators. In the special case of Brandt semigroups and their subsemigroups, the authors obtain a fairly concise formula for the number of errors of binary classifiers that can be corrected by the corresponding multiple classifiers. This formula is the main result of the paper. Examples are given to show that the formula does not directly generalize to all inverse semigroups and other classes of semigroups.
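To make the error-correction idea concrete, the following is a minimal Python sketch of a multiple classifier built from binary classifiers. It is not the semigroup-ring construction or the formula of the paper under review. It uses only the standard coding-theory fact that a class set with minimum Hamming distance d allows correction of up to floor((d - 1) / 2) errors under nearest-codeword decoding. The class set `CLASS_SET` and the function names are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions in which two equal-length binary words differ."""
    return sum(a != b for a, b in zip(u, v))


def min_distance(codewords):
    """Minimum pairwise Hamming distance of a collection of codewords."""
    return min(hamming(u, v) for u, v in combinations(codewords, 2))


def correctable_errors(codewords):
    """Errors guaranteed correctable by nearest-codeword decoding."""
    return (min_distance(codewords) - 1) // 2


def classify(outputs, class_set):
    """Return the class whose codeword is closest to the binary outputs."""
    return min(class_set, key=lambda label: hamming(outputs, class_set[label]))


# Hypothetical class set: each class is assigned one binary word, and each
# coordinate corresponds to the output of one binary classifier.
CLASS_SET = {
    "A": (0, 0, 0, 0, 0),
    "B": (1, 1, 1, 0, 0),
    "C": (0, 0, 1, 1, 1),
    "D": (1, 1, 0, 1, 1),
}

if __name__ == "__main__":
    print(correctable_errors(CLASS_SET.values()))
    # Outputs for class "B" with the second binary classifier in error.
    print(classify((1, 0, 1, 0, 0), CLASS_SET))
```

In this toy class set the minimum distance is 3, so a single erroneous binary classifier is tolerated. The brute-force distance computation grows quadratically with the number of classes, which is one reason a closed formula for structured class sets is of interest.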

94B05 General theory of linear codes
68P99 Theory of data
62P30 Applications of statistics in engineering and industry
