zbMATH — the first resource for mathematics

Noisy-or classifier. (English) Zbl 1160.68584
Summary: I discuss an application of a family of Bayesian network models, known as models of independence of causal influence (ICI), to classification tasks with large numbers of attributes. An example of such a task is the categorization of text documents, in which the attributes are single words from the documents. What makes the ICI models applicable here is their compact representation using a hidden variable. The issue of learning these classifiers by a computationally efficient implementation of the EM algorithm is addressed. Special attention is paid to the noisy-or model, probably the best-known example of an ICI model. Classification using the noisy-or model corresponds to a statistical method known as logistic discrimination; the correspondence is described. Tests of the noisy-or classifier on the Reuters data set show that, despite its simplicity, its performance is competitive.
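
For orientation, the standard noisy-or conditional probability can be sketched as below. This is a minimal illustration of the textbook noisy-or form with binary attributes, not the paper's implementation. The function name `noisy_or_probability`, the parameter names `inhibit` and `leak`, and the numbers in the usage note are assumptions made for this example.

```python
def noisy_or_probability(x, inhibit, leak=0.0):
    """Return P(class = 1 | x) under a textbook noisy-or model.

    x       -- sequence of 0/1 attribute values (e.g. word present / absent)
    inhibit -- inhibit[i] is the assumed probability that attribute i,
               when present, fails to switch the class on
    leak    -- assumed probability that the class is on with no attribute present
    """
    if len(x) != len(inhibit):
        raise ValueError("x and inhibit must have the same length")

    # Probability that the class stays off: the leak does not fire and
    # every present attribute is independently inhibited.
    p_off = 1.0 - leak
    for xi, qi in zip(x, inhibit):
        if xi:
            p_off *= qi

    return 1.0 - p_off
```

With attributes `[1, 0, 1]` and inhibition probabilities `[0.5, 0.2, 0.5]`, the class probability is 1 - 0.5 * 0.5 = 0.75. Only the present attributes contribute, so the cost of one evaluation grows linearly with the number of attributes.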

68T35 Theory of languages and software systems (knowledge-based systems, expert systems, etc.) for artificial intelligence