Improving classification when a class hierarchy is available using a hierarchy-based prior. (English) Zbl 1331.62316

Summary: We introduce a new method for building classification models when we have prior knowledge of how the classes can be arranged in a hierarchy, based on how easily they can be distinguished. The new method uses a Bayesian form of the multinomial logit (MNL, a.k.a. “softmax”) model, with a prior that introduces correlations between the parameters for classes that are nearby in the tree. On simulated data, we compare the new method with the ordinary MNL model and with a model that uses the hierarchy in a different way. We also test the new method on page layout analysis and document classification problems, and find that it performs better than the other methods.
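
The summary does not spell out how the prior induces correlations, so the following is only a minimal sketch of one way a tree-structured prior could do so, not the authors' stated construction. It assumes each branch of the tree carries an independent zero-mean Gaussian parameter vector and that a class's MNL parameters are the sum of the branch vectors on its root-to-leaf path. The four-class tree, the branch names, and the functions `sample_class_params` and `softmax_probs` are all hypothetical and chosen for illustration.

```python
import numpy as np

# Hypothetical 4-class tree (an assumption, not taken from the summary):
# root -> {A, B}; A -> {class 0, class 1}; B -> {class 2, class 3}.
# Each class is described by the branches on its root-to-leaf path.
PATHS = {
    0: ["A", "A0"],
    1: ["A", "A1"],
    2: ["B", "B2"],
    3: ["B", "B3"],
}
BRANCHES = sorted({b for path in PATHS.values() for b in path})


def sample_class_params(n_features, rng, scale=1.0):
    """Draw one (classes x features) parameter matrix from the assumed prior."""
    # One independent zero-mean Gaussian vector per branch (assumed prior form).
    phi = {b: rng.normal(0.0, scale, size=n_features) for b in BRANCHES}
    # Class parameters are sums along the path, so classes that share a
    # branch share a common term and are therefore correlated a priori.
    return np.stack([sum(phi[b] for b in PATHS[c]) for c in sorted(PATHS)])


def softmax_probs(x, beta):
    """MNL ("softmax") class probabilities for one feature vector x."""
    scores = beta @ x
    scores = scores - scores.max()  # subtract the max for numerical stability
    expd = np.exp(scores)
    return expd / expd.sum()


rng = np.random.default_rng(0)

# Estimate the prior correlation between class parameters for one feature.
draws = np.stack([sample_class_params(1, rng)[:, 0] for _ in range(20000)])
prior_corr = np.corrcoef(draws, rowvar=False)

# Class probabilities for one arbitrary feature vector under one prior draw.
beta = sample_class_params(3, rng)
probs = softmax_probs(np.array([1.0, -0.5, 2.0]), beta)
```

Under these assumptions, with equal branch variances, sibling classes (0 and 1, or 2 and 3) have a prior correlation of 0.5 because they share one of their two branch terms, while classes in different subtrees are uncorrelated. An ordinary MNL prior with independent class parameters would correspond to dropping the shared branches.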


62H30 Classification and discrimination; cluster analysis (statistical aspects)
62J05 Linear regression; mixed models