The duality diagram in data analysis: examples of modern applications. (English) Zbl 1234.62006

Summary: Today’s data-heavy research environment requires the integration of different sources of information into structured data sets that cannot be analyzed as simple matrices. We introduce an old technique, known in European data analysis circles as the Duality Diagram Approach, put to new uses through a variety of metrics and ways of combining different diagrams. This issue of this journal contains contemporary examples of how this approach provides solutions to hard problems in data integration. We present here the genesis of the technique and how it can be seen as a precursor of the modern kernel-based approaches.


62A01 Foundations and philosophical topics in statistics
62A99 Foundational topics in statistics


Guerry; ade4; R

