Computational aspects of algorithms for variable selection in the context of principal components. (English) Zbl 1429.62216

Summary: Variable selection consists in identifying a \(k\)-subset of a set of original variables that is optimal for a given criterion of adequate approximation to the whole data set. Several algorithms for the optimization problems resulting from three different criteria in the context of principal components analysis are considered, and computational results are presented.
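The selection problem described in the summary can be illustrated with a small sketch. The record does not name the three criteria, so the one used here is an assumption chosen for illustration: the share of total variance recovered when all variables are regressed on the chosen \(k\)-subset, computed as \(\operatorname{tr}(S_K^{-1}[S^2]_K)/\operatorname{tr}(S)\) from a covariance matrix \(S\). The search is a plain enumeration of all \(k\)-subsets, which is only feasible for small problems and is not one of the paper's algorithms. The names `solve`, `explained_share` and `best_subset` and the toy matrix are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve a @ x = b for square a (Gauss-Jordan, partial pivoting)."""
    n = len(a)
    # Augmented matrix [a | b]; copying keeps the inputs untouched.
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular submatrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [[m[i][n + j] / m[i][i] for j in range(len(b[0]))] for i in range(n)]


def explained_share(s, subset):
    """Assumed criterion: tr(S_K^{-1} [S^2]_K) / tr(S) for the given subset."""
    p = len(s)
    idx = list(subset)
    s_kk = [[s[i][j] for j in idx] for i in idx]
    # Submatrix of S^2 restricted to the subset rows and columns.
    s2_kk = [[sum(s[i][l] * s[l][j] for l in range(p)) for j in idx] for i in idx]
    z = solve(s_kk, s2_kk)
    return sum(z[i][i] for i in range(len(idx))) / sum(s[i][i] for i in range(p))


def best_subset(s, k):
    """Exhaustive search over all k-subsets (illustration only)."""
    return max(combinations(range(len(s)), k), key=lambda c: explained_share(s, c))


# Toy covariance matrix, invented for illustration: variables 0 and 1 are
# strongly correlated, variable 2 is uncorrelated with both.
S = [[1.0, 0.9, 0.0],
     [0.9, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
print(best_subset(S, 2), explained_share(S, best_subset(S, 2)))
```

Because the number of \(k\)-subsets grows combinatorially, enumeration of this kind quickly becomes impractical. The MSC classification of the paper under heuristics in mathematical programming suggests that search cost is a central concern.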

MSC:

62H25 Factor analysis and principal components; correspondence analysis
62-08 Computational methods for problems pertaining to statistics
90C59 Approximation methods and heuristics in mathematical programming

Software:

SoDA; R

References:

[1] Aarts, E. H.L.; Korst, J. H.M.; van Laarhoven, P. J.M., Simulated annealing, (Aarts, E.; Lenstra, J. K., Local Search in Combinatorial Optimization (1997), Wiley: Wiley New York), 91-120 · Zbl 0905.90140
[2] Becker, R., Chambers, J., Wilks, A., 1988. The New S Language. A Programming Environment for Data Analysis and Graphics. Wadsworth and Brooks/Cole Advanced Books and Software, Pacific Grove, CA. · Zbl 0642.68003
[3] Bonifas, I.; Escoufier, Y.; Gonzalez, P. L.; Sabatier, R., Choix de variables en analyse en composants principales, Rev. Statist. Appl, 23, 5-15 (1984) · Zbl 0583.62053
[4] Cadima, J.; Jolliffe, I. T., Loadings and correlations in the interpretation of principal components, J. Appl. Statist, 22, 2, 203-214 (1995)
[5] Cadima, J.; Jolliffe, I. T., Variable selection and the interpretation of principal subspaces, J. Agricultural, Biol. Environ. Statist, 6, 1, 62-79 (2001)
[6] Draper, N.; Smith, H., Applied Regression Analysis (1998), Wiley: Wiley New York · Zbl 0158.17101
[7] Duarte Silva, A. P., Efficient variable screening for multivariate analysis, J. Multivariate Anal, 76, 35-62 (2001) · Zbl 0996.62063
[8] Duarte Silva, A. P., Discarding variables in principal component analysis algorithms for all-subset comparisons, Comput. Statist, 17, 251-271 (2002) · Zbl 1010.62052
[9] de Falguerolles, A.; Jmel, S., Un critère de choix de variables en analyse en composants principales fondé sur des modèles graphiques gaussiens particuliers, Canad. J. Statist, 21, 3, 239-256 (1993) · Zbl 0785.62062
[10] Fedorov, V., Gruben, D., Leonov, S., 1999. Direct method of selecting informative variables. Smith Kline Beecham Biostatistics and Data Sciences Technical Report 1999-02.
[11] Furnival, G. M.; Wilson, R. W., Regressions by leaps and bounds, Technometrics, 16, 499-511 (1974), (reprinted in Technometrics 42, 1, 69-79) · Zbl 0294.62079
[12] Golub, G.; Van Loan, C., Matrix Computations (1996), Johns Hopkins University Press: Johns Hopkins University Press Baltimore, MD · Zbl 0865.65009
[13] Gonzalez, P. L.; Evry, R.; Cléroux, R.; Rioux, B., Selecting the best subset of variables in principal component analysis, (Momirovic, K.; Mildner, V., Compstat (1990), Physica-Verlag), 115-120
[14] Jolliffe, I. T., Discarding variables in a principal component analysis, I: Artificial data, Appl. Statist, 21, 160-173 (1972)
[15] Jolliffe, I. T., Discarding variables in a principal component analysis, II: Real data, Appl. Statist, 22, 21-31 (1973)
[16] Jolliffe, I. T., Principal Component Analysis (2002), Springer: Springer New York · Zbl 1011.62064
[17] Krzanowski, W. J., Selection of variables to preserve multivariate data structure using principal components, Appl. Statist, 36, 22-33 (1987)
[18] McCabe, G. P., Principal variables, Technometrics, 26, 2, 137-144 (1984) · Zbl 0548.62037
[19] McCabe, G.P., 1986. Prediction of principal components by variables subsets. Technical Report 86-19, Department of Statistics, Purdue University.
[20] Mühlenbein, H., Genetic algorithms, (Aarts, E.; Lenstra, J. K., Local Search in Combinatorial Optimization (1997), Wiley: Wiley New York), 137-171 · Zbl 0911.68156
[21] Ramsay, J. O.; ten Berge, J.; Styan, G. P.H., Matrix correlation, Psychometrika, 49, 3, 403-423 (1984) · Zbl 0581.62048
[22] Robert, P.; Escoufier, Y., A unifying tool for linear multivariate statistical methods: the RV-coefficient, Appl. Statist, 25, 3, 257-265 (1976)
[23] Tanaka, Y.; Mori, Y., Principal Component Analysis based on a subset of variables: variable selection and sensitivity analysis, American J. Math. Management Sci, 17, 1 & 2, 61-89 (1997) · Zbl 1007.62517
[24] Tovey, C. A., Local improvement on discrete structures, (Aarts, E.; Lenstra, J. K., Local Search in Combinatorial Optimization (1997), Wiley: Wiley New York), 57-89 · Zbl 0922.90116
[25] Whittaker, J., Graphical Models in Applied Multivariate Statistics (1990), Wiley: Wiley New York · Zbl 0732.62056