zbMATH — the first resource for mathematics

An empirical analysis of binary transformation strategies and base algorithms for multi-label learning. (English) Zbl 07255755
Summary: Investigating strategies that can efficiently handle multi-label classification tasks is an active research topic in machine learning. Many methods have been proposed, making the selection of the most suitable strategy a challenging issue. Motivated by this, the paper presents an extensive empirical analysis of binary transformation strategies and base algorithms for multi-label learning. This family of strategies uses the one-versus-all approach to transform the original data, generating one binary data set per label, to which any binary base algorithm can then be applied. Since the influence of the base algorithm on the predictive performance of these strategies has rarely been examined in depth in empirical studies, we investigated how distinct base algorithms affect the performance of several strategies. This study therefore covers a family of multi-label strategies combined with a diversified range of base algorithms, exploring their relationship from different perspectives. Our findings have significant implications for the evaluation methodology adopted in multi-label experiments involving binary transformation strategies, given that multiple base algorithms should be considered. Despite the improvements obtained by varying the strategy and base algorithm, for many data sets a large number of labels, mainly the less frequent ones, were either never predicted or always misclassified. We conclude the experimental analysis by recommending strategies and base algorithms according to different performance criteria.
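The one-versus-all (binary relevance) transformation described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy data and the nearest-centroid base learner are assumptions chosen only to keep the example self-contained; in practice any binary base algorithm (SVM, decision tree, random forest, etc.) would be plugged in per label.

```python
# Minimal binary relevance (one-versus-all) sketch: one binary data set,
# and one base classifier, per label. Toy data and the nearest-centroid
# base learner are illustrative assumptions, not from the original paper.

class NearestCentroid:
    """Tiny binary base learner: predicts the class of the nearer centroid."""

    def fit(self, X, y):
        pos = [x for x, t in zip(X, y) if t == 1]
        neg = [x for x, t in zip(X, y) if t == 0]
        self.pos_c = [sum(c) / len(pos) for c in zip(*pos)] if pos else None
        self.neg_c = [sum(c) / len(neg) for c in zip(*neg)] if neg else None
        return self

    def predict(self, X):
        def dist2(a, b):  # squared Euclidean distance
            return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        out = []
        for x in X:
            if self.pos_c is None:
                out.append(0)          # label never seen positive
            elif self.neg_c is None:
                out.append(1)          # label always positive
            else:
                out.append(1 if dist2(x, self.pos_c) <= dist2(x, self.neg_c) else 0)
        return out


def binary_relevance_fit(X, Y, make_base):
    """Y holds one binary label vector per instance; train one model per label."""
    n_labels = len(Y[0])
    models = []
    for j in range(n_labels):
        y_j = [row[j] for row in Y]    # the binary data set for label j
        models.append(make_base().fit(X, y_j))
    return models


def binary_relevance_predict(models, X):
    """Combine the per-label binary predictions back into label vectors."""
    per_label = [m.predict(X) for m in models]
    return [list(labels) for labels in zip(*per_label)]


# Four instances, two labels.
X = [[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]]
Y = [[1, 0], [1, 0], [0, 1], [1, 1]]
models = binary_relevance_fit(X, Y, NearestCentroid)
print(binary_relevance_predict(models, [[0.05, 0.1], [0.95, 1.05]]))
# → [[1, 0], [0, 1]]
```

Note that each per-label model is trained in complete isolation, which is exactly the label-independence assumption that strategies such as classifier chains [42, 43] and dependent binary relevance [35] later relax.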

68T05 Learning and adaptive systems in artificial intelligence
Full Text: DOI
[1] Alali, A.; Kubat, M., PruDent: A pruned and confident stacking approach for multi-label classification, IEEE Transactions on Knowledge and Data Engineering, 27, 9, 2480-2493 (2015)
[2] Benavoli, A.; Corani, G.; Demsar, J.; Zaffalon, M., Time for a change: A tutorial for comparing multiple classifiers through Bayesian analysis, Journal of Machine Learning Research, 18, 77:1-77:36 (2017) · Zbl 1440.62237
[3] Bernardini, FC; Benito, E.; Meza, M., Cardinality and density measures and their influence to multi-label learning methods, Journal of the Brazilian Society on Computational Intelligence, 12, 1, 53-71 (2014)
[4] Boutell, MR; Luo, J.; Shen, X.; Brown, CM, Learning multi-label scene classification, Pattern Recognition, 37, 9, 1757-1771 (2004)
[5] Breiman, L., Random forests, Machine Learning, 45, 1, 5-32 (2001) · Zbl 1007.68152
[6] Briggs, F., Huang, Y., Raich, R., Eftaxias, K., Lei, Z., Cukierski, W., Hadley, S. F., et al. (2013). The 9th annual MLSP competition: New methods for acoustic classification of multiple simultaneous bird species in a noisy environment. In IEEE International workshop on machine learning for signal processing (pp. 1-8). 10.1109/MLSP.2013.6661934.
[7] Chang, CC; Lin, CJ, LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology, 2, 27:1-27:27 (2011)
[8] Charte, F., Rivera, A. J., del Jesus, M. J., & Herrera, F. (2015). QUINTA: A question tagging assistant to improve the answering ratio in electronic forums. In IEEE international conference on computer as a tool, IEEE (pp. 1-6). 10.1109/EUROCON.2015.7313677.
[9] Charte, F.; Charte, FD, Working with multilabel datasets in R: The mldr Package, The R Journal, 7, 2, 149-162 (2015)
[10] Charte, F.; Rivera, AJ; Charte, D.; del Jesús, MJ; Herrera, F., Tips, guidelines and tools for managing multi-label datasets: The mldr.datasets R package and the cometa data repository, Neurocomputing, 289, 68-85 (2018)
[11] Chen, T., & Guestrin, C. (2016). Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM international conference on knowledge discovery and data mining (pp. 785-794). 10.1145/2939672.2939785.
[12] Cherman, EA; Metz, J.; Monard, MC, Incorporating label dependency into the binary relevance framework for multi-label classification, Expert Systems with Applications, 39, 2, 1647-1655 (2012)
[13] Cherman, EA; Spolaôr, N.; Valverde-Rebaza, J.; Monard, MC, Lazy multi-label learning algorithms based on mutuality strategies, Journal of Intelligent & Robotic Systems (2014)
[14] de Carvalho, ACPLF; Freitas, AA; Abraham, A.; Hassanien, AE; Snášel, V., A tutorial on multi-label classification techniques, Foundations of computational intelligence, 177-195 (2009), Berlin: Springer, Berlin
[15] de Sá, A. G. C., Freitas, A. A., & Pappa, G. L. (2018). Automated selection and configuration of multi-label classification algorithms with grammar-based genetic programming. In A. Auger, C. M. Fonseca, N. Lourenço, P. Machado, L. Paquete, D. Whitley (Eds.), Parallel Problem Solving from Nature - PPSN XV−15th international conference, Coimbra, Portugal, September 8-12, 2018, Proceedings, Part II, Springer, Lecture Notes in Computer Science (Vol. 11102, pp. 308-320). 10.1007/978-3-319-99259-4_25.
[16] de Sá, A. G. C., Pappa, G. L., & Freitas, A. A. (2017). Towards a method for automatically selecting and configuring multi-label classification algorithms. In Proceedings of the genetic and evolutionary computation conference companion (pp. 1125-1132) 10.1145/3067695.3082053.
[17] Duygulu, P., Barnard, K., de Freitas, J. F. G., & Forsyth, D. A. (2002). Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. In A. Heyden, G. Sparr, M. Nielsen, P. Johansen (Eds.), Computer Vision—ECCV 2002, 7th European conference on computer vision, Copenhagen, Denmark, May 28-31, 2002, Proceedings, Part IV, Lecture Notes in Computer Science (Vol. 2353, pp. 97-112). Berlin: Springer. 10.1007/3-540-47979-1_7. · Zbl 1039.68623
[18] Elisseeff, A., & Weston, J. (2001). A kernel method for multi-labeled classification. In Proceedings of the neural information processing systems (pp. 681-687).
[19] Gelman, A.; Hill, J., Data analysis using regression and multilevel/hierarchical models. Analytical methods for social research (2007), New York: Cambridge University Press, New York
[20] Gibaja, E.; Ventura, S., Multi-label learning: A review of the state of the art and ongoing research, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 4, 6, 411-444 (2014)
[21] Gibaja, E.; Ventura, S., A tutorial on multilabel learning, ACM Computing Surveys, 47, 3, 1-38 (2015)
[22] Godbole, S., & Sarawagi, S. (2004). Discriminative methods for multi-labeled classification. In Proceedings of the 8th Pacific-Asia conference, (pp. 22-30) 10.1007/978-3-540-24775-3_5.
[23] Gonçalves, E. C., Plastino, A., & Freitas, A. A. (2013). A genetic algorithm for optimizing the label ordering in multi-label classifier chains. In Proceedings of the international conference on tools with artificial intelligence (pp. 469-476). 10.1109/ICTAI.2013.76.
[24] Jackson, P.; Moulinier, I., Natural language processing for online applications: Text retrieval, extraction & categorization (2002), Amsterdam: John Benjamins, Amsterdam
[25] Jain, AK; Dubes, RC, Algorithms for clustering data (1988), Upper Saddle River, NJ: Prentice-Hall Inc, Upper Saddle River, NJ
[26] Joachims, T., Text categorization with support vector machines: Learning with many relevant features, Proceedings of the 10th European Conference on Machine Learning, 1398, 137-142 (1998)
[27] Klimt, B., & Yang, Y. (2004). The Enron Corpus: A new dataset for email classification research. In Proceedings of the 15th European conference on Machine Learning (pp. 217-226) 10.1007/978-3-540-30115-8_22. · Zbl 1132.68562
[28] Lang, K. (1995). Newsweeder: Learning to filter Netnews. In Proceedings of the twelfth international conference on machine learning, (pp. 331-339).
[29] Li, Y. k., & Zhang, M. L. (2014). Enhancing binary relevance for multi-label learning with controlled label correlations exploitation. In 13th Pacific Rim International Conference on Artificial Intelligence (pp. 91-103). 10.1007/978-3-319-13560-1_8.
[30] Liu, SM; Chen, J., An empirical study of empty prediction of multi-label classification, Expert Systems with Applications, 42, 13, 5567-5579 (2015)
[31] Luaces, O.; Díez, J.; Barranquero, J.; del Coz, JJ; Bahamonde, A., Binary relevance efficacy for multilabel classification, Progress in Artificial Intelligence, 1, 4, 303-313 (2012)
[32] Madjarov, G.; Kocev, D.; Gjorgjevikj, D.; Džeroski, S., An extensive experimental comparison of methods for multi-label learning, Pattern Recognition, 45, 9, 3084-3104 (2012)
[33] Mantovani, R. G., Rossi, A. L. D., Vanschoren, J., Bischl, B., & Carvalho, A. C. P. L. F. (2015). To tune or not to tune: Recommending when to adjust SVM hyper-parameters via meta-learning. In 2015 International Joint Conference on Neural Networks, IEEE, (pp. 1-8). 10.1109/IJCNN.2015.7280644.
[34] Metz, J., de Abreu, L. F., Cherman, E. A., & Monard, M. C. (2012). On the estimation of predictive evaluation measure baselines for multi-label learning. In 13th Ibero-American Conference on Artificial Intelligence (pp. 189-198).
[35] Montañes, E.; Senge, R.; Barranquero, J.; Quevedo, JR; del Coz, JJ; Hüllermeier, E., Dependent binary relevance models for multi-label classification, Pattern Recognition, 47, 3, 1494-1508 (2014)
[36] Moyano, JM; Galindo, ELG; Cios, KJ; Ventura, S., Review of ensembles of multi-label classifiers: Models, experimental study and prospects, Information Fusion, 44, 33-45 (2018)
[37] Pereira, RB; Plastino, A.; Zadrozny, B.; Merschmann, LH, Correlation analysis of performance measures for multi-label classification, Information Processing & Management, 54, 3, 359-369 (2018)
[38] Pestian, J. P., Brew, C., Matykiewicz, P., Hovermale, D. J., Johnson, N., Cohen, K. B., & Duch, W. (2007). A shared task involving multi-label classification of clinical free text. In Proceedings of the workshop on biological, translational, and clinical language processing, association for computational linguistics (pp. 97-104).
[39] Quinlan, JR, C4.5: Programs for Machine Learning (1993), San Francisco, CA: Morgan Kaufmann Publishers Inc., San Francisco, CA
[40] Raez, A. M., Lopez, L. A. U., Steinberger, R. (2004). Adaptive selection of base classifiers in one-against-all learning for large multi-labeled collections. In Advances in Natural Language Processing (pp. 1-12). 10.1007/978-3-540-30228-5_1.
[41] Rauber, T. W., Mello, L. H., Rocha, V. F., Luchi, D., & Varejão, F. M. (2014). Recursive dependent binary relevance model for multi-label classification. In A. L. Bazzan, K. Pichara (Eds), Advances in artificial intelligence—IBERAMIA 2014 (pp. 206-217). 10.1007/978-3-319-12027-0_17.
[42] Read, J.; Pfahringer, B.; Holmes, G.; Frank, E., Classifier chains for multi-label classification, Proceedings of the European Conference, Bled, Slovenia, 5782, 254-269 (2009)
[43] Read, J.; Pfahringer, B.; Holmes, G.; Frank, E., Classifier chains for multi-label classification, Machine Learning, 85, 3, 333-359 (2011)
[44] Rivolli, A., & de Carvalho, A. C. P. L. F. (2018). The utiml Package: Multi-label Classification in R. The R Journal. https://journal.r-project.org/archive/2018/RJ-2018-041/index.html.
[45] Rivolli, A.; Soares, C.; de Carvalho, ACPLF, Enhancing multilabel classification for food truck recommendation, Expert Systems (2018)
[46] Schapire, ER; Singer, Y., Improved boosting algorithm using confidence-rated predictions, Machine Learning, 37, 3, 297-336 (1999) · Zbl 0945.68194
[47] Sechidis, K., Tsoumakas, G., & Vlahavas, I. (2011). On the stratification of multi-label data. In D. Gunopulos, T. Hofmann, D. Malerba, Vazirgiannis M. (Eds.), Machine learning and knowledge discovery in databases (pp. 145-158). 10.1007/978-3-642-23808-6_10.
[48] Senge, R., del Coz, J. J., & Hüllermeier, E. (2013). Rectifying classifier chains for multi-label classification. In Proceedings of the Workshop of Lernen, Wissen & Adaptivität, Bamberg, Germany (pp. 162-169).
[49] Snoek, C. G. M., Worring, M., van Gemert, J. C., Geusebroek, J. M., & Smeulders, A. W. M. (2006). The challenge problem for automated detection of 101 semantic concepts in multimedia. In Proceedings of the 14th ACM international conference on multimedia, (pp. 421-430) 10.1145/1180639.1180727.
[50] Srivastava, A. N., & Zane-Ulman, B. (2005). Discovering recurring anomalies in text reports regarding complex space systems. In IEEE aerospace conference (pp. 3853-3862). 10.1109/AERO.2005.1559692.
[51] Trohidis, K.; Tsoumakas, G.; Kalliris, G.; Vlahavas, I., Multi-label classification of music by emotion, Journal on Audio, Speech, and Music Processing, 2011, 1, 4 (2011)
[52] Tsoumakas, G., Katakis, I., & Vlahavas, I. (2008). Effective and efficient multilabel classification in domains with large number of labels. In Proceedings of European conference on machine learning and principles and practice of knowledge discovery in databases, workshop on mining multidimensional data (pp. 30-44).
[53] Tsoumakas, G., Loza Mencía, E., Katakis, I., Park, S. H., & Fürnkranz, J. (2009). On the combination of two decompositive multi-label classification methods. In Proceedings of the European conference on machine learning and principles and practice of knowledge discovery, workshop on preference learning (pp. 114-129).
[54] Tsoumakas, G.; Katakis, I., Multi-label classification: An overview, International Journal of Data Warehousing and Mining, 3, 3, 1-13 (2007)
[55] Tsoumakas, G.; Katakis, I.; Vlahavas, I.; Maimon, O.; Rokach, L., Mining multi-label data, Data mining and knowledge discovery handbook, Chap 34, 667-685 (2010), Berlin: Springer, Berlin
[56] Tsoumakas, G.; Katakis, I.; Vlahavas, I., Random k-labelsets for multi-label classification, IEEE Transactions on Knowledge and Data Engineering, 23, 7, 1079-1089 (2011)
[57] Tsoumakas, G.; Katakis, I.; Vlahavas, I., Random k-labelsets for multilabel classification, IEEE Transactions on Knowledge and Data Engineering, 23, 7, 1079-1089 (2011)
[58] Turnbull, D.; Barrington, L.; Torres, D.; Lanckriet, G., Semantic annotation and retrieval of music and sound effects, IEEE Transactions on Audio, Speech, and Language Processing, 16, 2, 467-476 (2008)
[59] Wever, M., Mohr, F., & Hüllermeier, E. (2018). Automated multi-label classification based on ML-plan. arXiv:1811.04060. · Zbl 06990191
[60] Wever, M. D., Mohr, F., Tornede, A., & Hüllermeier, E. (2019). Automating multi-label classification extending ml-plan. In 6th ICML Workshop on Automated Machine Learning.
[61] Wolpert, DH, Stacked generalization, Neural Networks, 5, 2, 241-259 (1992)
[62] Yang, Y., An evaluation of statistical approaches to text categorization, Information Retrieval, 1, 1-2, 69-90 (1999)
[63] Zhang, ML; Wu, L., Lift: Multi-Label learning with label-specific features, IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 1, 107-120 (2015)
[64] Zhang, ML; Zhou, ZH, A review on multi-label learning algorithms, IEEE Transactions on Knowledge and Data Engineering, 26, 8, 1819-1837 (2014)
[65] Zhou, Z., & Zhang, M. (2006). Multi-instance multi-label learning with application to scene classification. In B. Schölkopf, J. C. Platt, & T. Hofmann (Eds.), Advances in neural information processing systems 19, Proceedings of the twentieth annual conference on neural information processing systems, Vancouver, British Columbia, December 4-7, 2006, (pp. 1609-1616). Cambridge: MIT Press.
[66] Zhou, T.; Tao, D.; Wu, X., Compressed labeling on distilled labelsets for multi-label learning, Machine Learning, 88, 1-2, 69-126 (2012) · Zbl 1243.68259
[67] Zufferey, D.; Hofer, T.; Hennebert, J.; Schumacher, M.; Ingold, R.; Bromuri, S., Performance comparison of multi-label learning algorithms on clinical data for chronic diseases, Computers in Biology and Medicine, 65, 34-43 (2015)