
Isotonic boosting classification rules. (English) Zbl 07363875

Summary: In many real classification problems, a monotone relation between some predictors and the classes can be assumed: higher (or lower) values of those predictors are associated with higher levels of the response. In this paper, we propose new boosting algorithms, based on LogitBoost, that incorporate this isotonicity information and yield more accurate and more easily interpretable rules. The algorithms rest on theoretical developments involving isotonic regression. We show the good performance of these procedures not only in simulations but also on real data sets from two very different contexts, namely cancer diagnosis and induction motor failure.
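
Illustration (not part of the original record): the summary gives no algorithmic detail, so the following Python sketch only shows the general idea it describes. It combines the standard binary LogitBoost update of Friedman, Hastie and Tibshirani with a weighted pool-adjacent-violators (PAVA) base learner, selected component-wise by weighted squared error. It is not the authors' algorithm and not the API of the isoboost package. The function names `pava`, `isotonic_fit` and `isotonic_logitboost`, the clipping bound of 4 and the toy data are all assumptions made for the example. The sketch further assumes that every predictor is isotonic (nondecreasing), that values within each predictor are distinct, and it returns fitted probabilities for the training sample only.

```python
import math

def pava(values, weights):
    """Weighted pool-adjacent-violators: nondecreasing least-squares fit."""
    blocks = []  # each block holds [weighted mean, total weight, point count]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while adjacent blocks violate the monotone order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    fit = []
    for m, _, n in blocks:
        fit.extend([m] * n)
    return fit

def isotonic_fit(x, z, w):
    """Fit a nondecreasing function of x to z; fitted values in input order.

    Assumes the values in x are distinct (ties are not pooled).
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    fitted_sorted = pava([z[i] for i in order], [w[i] for i in order])
    fitted = [0.0] * len(x)
    for rank, i in enumerate(order):
        fitted[i] = fitted_sorted[rank]
    return fitted

def isotonic_logitboost(X, y, n_rounds=20, z_max=4.0):
    """Hypothetical sketch: X is a list of rows, y holds 0/1 labels.

    Returns the fitted P(y = 1) for each training row.
    """
    n, d = len(X), len(X[0])
    F = [0.0] * n
    p = [0.5] * n
    for _ in range(n_rounds):
        # LogitBoost working weights and (clipped) working responses.
        w = [max(pi * (1.0 - pi), 1e-10) for pi in p]
        z = [max(-z_max, min(z_max, (yi - pi) / wi))
             for yi, pi, wi in zip(y, p, w)]
        # Component-wise step: keep the predictor whose isotonic fit
        # has the smallest weighted squared error.
        best = None
        for j in range(d):
            f = isotonic_fit([row[j] for row in X], z, w)
            sse = sum(wi * (zi - fi) ** 2 for wi, zi, fi in zip(w, z, f))
            if best is None or sse < best[0]:
                best = (sse, f)
        F = [Fi + 0.5 * fi for Fi, fi in zip(F, best[1])]
        p = [1.0 / (1.0 + math.exp(-2.0 * Fi)) for Fi in F]
    return p

# Toy data (made up for this example): two predictors, binary response.
X = [[1, 5], [2, 1], [3, 4], [4, 2], [5, 6], [6, 3]]
y = [0, 0, 1, 0, 1, 1]
p = isotonic_logitboost(X, y)
```

Because each boosting step adds a nondecreasing function of a single predictor, the fitted score is nondecreasing in every predictor, so a row that is at least as large as another in all predictors never receives a smaller fitted probability. That monotonicity is what makes rules of this kind easy to interpret.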

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)
