
Looking inside self-organizing map ensembles with resampling and negative correlation learning. (English) Zbl 1217.68186

Summary: We focus on the problem of training ensembles or, more generally, sets of self-organizing maps (SOMs). In light of recent theory on ensemble learning, in particular negative correlation learning (NCL), the question arises whether SOM ensemble learning can benefit from non-independent learning, in which the individual learning stages are interlinked by a term penalizing correlated errors. We show that SOMs with a small number of neurons are well suited as weak ensemble components. Using our approach, we obtain efficiently trained SOM ensembles that outperform other reference learners. Owing to the transparency of SOMs, we gain insight into the interrelation between diversity and sublocal accuracy inside the maps. We shed light on the diversity arising from a combination of several factors: explicit versus implicit diversity, as well as inter- versus intra-diversity. NCL fully exploits the potential of SOM ensemble learning when the individual neural networks cooperate at the highest level and stability is satisfied. The reported quantified diversities exhibit high correlations with the prediction performance.
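
To make the NCL mechanism concrete: in the standard formulation of Liu and Yao [25], ensemble member i minimizes e_i = (1/2)(f_i - y)^2 + lambda * p_i with penalty p_i = (f_i - f_bar) * sum_{j != i} (f_j - f_bar), where f_bar is the ensemble average; treating f_bar as constant, this yields the per-member gradient (f_i - y) - lambda * (f_i - f_bar). The short Python sketch below illustrates this penalty with simple linear members standing in for the paper's small SOMs (the authors' own implementation is the R package lerranco [32]); all names and values here (X, y, W, lam, the learning rate) are illustrative assumptions, not taken from the paper.

    import numpy as np

    # Sketch of negative correlation learning (NCL) for a regression ensemble,
    # after Liu & Yao [25]. Linear members stand in for small SOMs; all names
    # and hyperparameters are illustrative assumptions, not from the paper.
    rng = np.random.default_rng(0)

    X = rng.normal(size=(200, 5))                             # toy inputs
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)   # toy targets

    M = 5                        # ensemble size
    lam = 0.5                    # penalty strength; lam = 0 gives independent training
    W = rng.normal(size=(M, 5))  # one linear member per row

    for _ in range(2000):
        preds = W @ X.T             # (M, N) member predictions
        f_bar = preds.mean(axis=0)  # ensemble (simple-average) output
        # NCL gradient per member: (f_i - y) - lam * (f_i - f_bar);
        # the second term pushes members apart, creating error diversity.
        g = (preds - y) - lam * (preds - f_bar)
        W -= 0.01 * (g @ X) / len(y)  # batch gradient step

    f_final = (W @ X.T).mean(axis=0)
    print("ensemble MSE:", float(np.mean((f_final - y) ** 2)))

Raising lam trades individual member accuracy for diversity around the consensus, which is the accuracy-diversity interplay quantified in the summary above.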

MSC:

68T05 Learning and adaptive systems in artificial intelligence

References:

[1] Asuncion, A., & Newman, D. J. (2007). UCI machine learning repository.
[2] Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32 · Zbl 1007.68152
[3] Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140 · Zbl 0858.68080
[4] Brown, G. (2004). Diversity in neural network ensembles. Ph.D. thesis, University of Birmingham.
[5] Brown, G.; Wyatt, J.; Harris, R.; Yao, X., Diversity creation methods: a survey and categorisation, Information Fusion, 6, 5-20 (2005)
[6] Brown, G.; Wyatt, J.; Sun, P., Between two extremes: examining decompositions of the ensemble objective function, (International workshop on multiple classifier systems, LNCS, Vol. 3541 (2005))
[7] Brown, G.; Wyatt, J. L.; Kaelbling, P., Managing diversity in regression ensembles, Journal of Machine Learning Research, 6 (2005) · Zbl 1222.68154
[8] Cleveland, W. S.; Devlin, S. J., Locally-weighted regression: an approach to regression analysis by local fitting, Journal of the American Statistical Association, 83, 596-610 (1988) · Zbl 1248.62054
[9] Cortez, P., & Morais, A. (2007). A data mining approach to predict forest fires using meteorological data. In J. Neves, M. F. Santos & J. Machado (Eds.), New trends in artificial intelligence, proceedings of the 13th EPIA 2007—Portuguese conference on artificial intelligence.
[10] Dietterich, T. G., Ensemble learning, (Arbib, M. A., The handbook of brain theory and neural networks (2002), The MIT Press: The MIT Press Cambridge, MA), 405-408, (Chapter)
[11] Dimitriadou, E., Hornik, K., Leisch, F., Meyer, D., & Weingessel, A. (2009). e1071: misc functions of the Department of Statistics (e1071), TU Wien. R package version 1.5-19.
[12] Dittenbach, M.; Merkl, D.; Rauber, A., The growing hierarchical self-organizing map, (IJCNN (6) (2000), IEEE Computer Society), 15-19
[13] Freund, Y.; Schapire, R. E., Experiments with a new boosting algorithm, (Proceedings of the thirteenth international conference on machine learning (1996), Morgan Kaufmann), 148-156
[14] Friedman, J. H., Multivariate adaptive regression splines, The Annals of Statistics, 19, 1-67 (1991) · Zbl 0765.62064
[15] Fritzke, B., Fast learning with incremental RBF networks, (Neural Processing Letters (1994)), 2-5
[16] Geman, S.; Bienenstock, E.; Doursat, R., Neural networks and the bias/variance dilemma, Neural Computation, 4, 1-58 (1992)
[17] Goerke, N.; Kintzler, F.; Eckmiller, R., Self organized partitioning of chaotic attractors for control, (ICANN’01: proceedings of the international conference on artificial neural networks (2001), Springer-Verlag: Springer-Verlag London, UK), 851-856 · Zbl 1001.68783
[18] Hastie, T.; Tibshirani, R.; Friedman, J., The elements of statistical learning: data mining, inference, and prediction (2001), Springer · Zbl 0973.62007
[19] Ho, T. K., The random subspace method for constructing decision forests, IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 832-844 (1998)
[20] Inoue, H.; Narihisa, H., Effective pruning method for a multiple classifier system based on self-generating neural networks, (Artificial neural networks and neural information processing—ICANN/ICONIP 2003, Vol. 2714 (2003), Springer: Springer Berlin, Heidelberg), 11-18 · Zbl 1037.68658
[21] Kawashima, S.; Ogata, H.; Kanehisa, M., AAindex: amino acid index database, Nucleic Acids Research, 27, 368-369 (1999)
[22] Kohonen, T., Self-organized formation of topologically correct feature maps, (Biological Cybernetics, Vol. 43 (1982)), 59-69 · Zbl 0466.92002
[23] Krogh, A.; Vedelsby, J., Neural network ensembles, cross validation, and active learning, (Adv. in NIPS (1995), MIT Press), 231-238
[24] Liaw, A.; Wiener, M., Classification and regression by randomforest, R News, 2, 18-22 (2002)
[25] Liu, Y.; Yao, X., Ensemble learning via negative correlation, Neural Networks, 12, 1399-1404 (1999)
[26] Miikkulainen, R., Script recognition with hierarchical feature maps, Connection Science, 2, 83-101 (1990)
[27] Millington, P. J., & Baker, W. L. (1990). Associative reinforcement learning for optimal control. In Proc. conf. on AIAA guid. nav. and cont. Vol. 2.
[28] Minku, F. L.; Inoue, H.; Yao, X., Negative correlation in incremental learning, Natural Computing, 8, 289-320 (2009) · Zbl 1188.68231
[29] Prudhomme, E., & Lallich, S. (2008). Optimization of self-organizing maps ensemble in prediction. In International conference on data mining, DMIN'08.
[30] R Development Core Team (2008). R: a language and environment for statistical computing. R Foundation for Statistical Computing, Austria. ISBN: 3-900051-07-0.
[31] Ritter, H., Learning with the self-organizing map, (Kohonen, T.; etal., Artificial neural networks (1991), Elsevier Science Publishers), 379-384
[32] Scherbart, A. (2009). Lerranco: SOM ensembles. R package version 0.1.
[33] Scherbart, A.; Nattkemper, T. W., The diversity of regression ensembles combining bagging and random subspace method, (Köppen, M.; Kasabov, N. K.; Coghill, G. G., ICONIP (2), Lecture notes in computer science, Vol. 5507 (2008), Springer), 911-918
[34] Scherbart, A., Timm, W., Böcker, S., & Nattkemper, T. W. (2007). SOM-based peptide prototyping for mass spectrometry peak intensity prediction. In WSOM'07.
[35] Schölkopf, B.; Bartlett, P.; Smola, A.; Williamson, R., Shrinking the tube: a new support vector regression algorithm, (Advances in Neural Information Processing Systems (1999))
[36] Timm, W.; Scherbart, A.; Böcker, S.; Kohlbacher, O.; Nattkemper, T. W., Peak intensity prediction in MALDI-TOF mass spectrometry: a machine learning study to support quantitative proteomics, BMC Bioinformatics, 9, 443 (2008)
[37] Vlachos, P. (2005). Statlib datasets archive.
[38] Wehrens, R., & Mevik, B.-H. (2007). PLS: Partial Least Squares Regression (PLSR) and Principal Component Regression (PCR). R package version 2.1-0.