
On the usage of joint diagonalization in multivariate statistics. (English) Zbl 1493.62318

Summary: Scatter matrices generalize the covariance matrix and are useful in many multivariate data analysis methods, including the well-known principal component analysis (PCA), which is based on the diagonalization of the covariance matrix. The simultaneous diagonalization of two or more scatter matrices goes beyond PCA and is increasingly used. In this paper, we offer an overview of many methods that are based on joint diagonalization. These methods range from the unsupervised context, with invariant coordinate selection and blind source separation (which includes independent component analysis), to the supervised context, with discriminant analysis and sliced inverse regression. They also encompass methods that handle dependent data such as time series or spatial data.
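As an illustration of what simultaneous diagonalization of two scatter matrices means, the following is a minimal sketch and not code from the paper. It assumes the covariance matrix as the first scatter and a fourth-moment scatter as the second, a pairing often used in invariant coordinate selection and independent component analysis. The function names `cov4` and `joint_diagonalize_two` and the simulated data are hypothetical and chosen only for this example.

```python
import numpy as np


def cov4(x):
    """Fourth-moment scatter: average of r_i^2 (x_i - mean)(x_i - mean)^T / (p + 2),
    where r_i is the Mahalanobis distance of observation i (assumed definition)."""
    n, p = x.shape
    xc = x - x.mean(axis=0)
    s1_inv = np.linalg.inv(np.cov(xc, rowvar=False))
    # Squared Mahalanobis distance of each row with respect to the covariance.
    r2 = np.einsum("ij,jk,ik->i", xc, s1_inv, xc)
    return (xc * r2[:, None]).T @ xc / (n * (p + 2))


def joint_diagonalize_two(s1, s2):
    """Return (b, d) such that b @ s1 @ b.T = I and b @ s2 @ b.T = diag(d).

    s1 must be symmetric positive definite and s2 symmetric.
    """
    # Step 1: whiten with the inverse symmetric square root of s1.
    vals, vecs = np.linalg.eigh(s1)
    s1_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    # Step 2: eigendecompose the second scatter in the whitened coordinates.
    d, u = np.linalg.eigh(s1_inv_sqrt @ s2 @ s1_inv_sqrt)
    order = np.argsort(d)[::-1]  # sort eigenvalues in decreasing order
    return u[:, order].T @ s1_inv_sqrt, d[order]


# Simulated data: three independent sources mixed by a random matrix.
rng = np.random.default_rng(0)
n = 5000
sources = np.column_stack([
    rng.uniform(-1.0, 1.0, n),
    rng.exponential(1.0, n),
    rng.standard_normal(n),
])
mixing = rng.standard_normal((3, 3))
x = sources @ mixing.T

s1 = np.cov(x, rowvar=False)
s2 = cov4(x)
b, d = joint_diagonalize_two(s1, s2)
```

With a single scatter matrix, only the first step is needed and the result is PCA-like whitening. The second scatter fixes the remaining rotation, which is why the transformed coordinates `x @ b.T` carry information beyond PCA. Joint diagonalization of more than two matrices is in general only approximate and needs an iterative algorithm, which this sketch does not cover.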

MSC:

62H12 Estimation in multivariate analysis
62H25 Factor analysis and principal components; correspondence analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62H99 Multivariate analysis
62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62M40 Random fields; image analysis
