Inverse regression for ridge recovery: a data-driven approach for parameter reduction in computer experiments. (English) Zbl 1436.62337

Summary: Parameter reduction can enable otherwise infeasible design and uncertainty studies with modern computational science models that contain several input parameters. In statistical regression, techniques for sufficient dimension reduction (SDR) use data to reduce the predictor dimension of a regression problem. A computational scientist hoping to use SDR for parameter reduction encounters a problem: a computer prediction is best represented by a deterministic function of the inputs, so data consisting of computer simulation queries fail to satisfy the SDR assumptions. To address this problem, we interpret the SDR methods sliced inverse regression (SIR) and sliced average variance estimation (SAVE) as estimating the directions of a ridge function, which is a composition of a low-dimensional linear transformation with a nonlinear function. Under this interpretation, SIR and SAVE estimate matrices of integrals whose column spaces are contained in the span of the ridge directions; we analyze and numerically verify the convergence of these column spaces as the number of computer model queries increases. Moreover, we exhibit example functions that are not ridge functions but whose inverse conditional moment matrices are low-rank. Consequently, the computational scientist should beware when using SIR and SAVE for parameter reduction, since SIR and SAVE may mistakenly suggest that truly important directions are unimportant.
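To make the summary concrete, the following is a minimal sketch (not taken from the paper) of SIR applied to deterministic queries of a one-dimensional ridge function. The particular nonlinearity g(t) = t + sin(t), the input dimension, the sample size, and the slice count are illustrative assumptions; with standard normal inputs the usual input standardization step reduces to the identity.

```python
import numpy as np

# Ridge function f(x) = g(a^T x) with g(t) = t + sin(t) and a single
# ridge direction a = e_1 (illustrative choices, not from the paper).
rng = np.random.default_rng(0)
m = 10                                # input dimension
a = np.zeros(m); a[0] = 1.0           # true ridge direction
X = rng.standard_normal((5000, m))    # standard normal inputs (covariance = I)
y = X @ a + np.sin(X @ a)             # deterministic computer-model queries

# Sliced inverse regression (SIR): partition the samples into slices by
# sorted output value, average the inputs within each slice, and take the
# dominant eigenvector of the sample covariance of the slice means.
n_slices = 10
order = np.argsort(y)
slices = np.array_split(order, n_slices)
means = np.array([X[idx].mean(axis=0) for idx in slices])
C = means.T @ means / n_slices        # sample inverse-regression moment matrix
eigvals, eigvecs = np.linalg.eigh(C)  # eigenvalues in ascending order
direction = eigvecs[:, -1]            # estimated ridge direction (up to sign)
print(abs(direction @ a))             # near 1 when the direction is recovered
```

Here the estimated column space of the inverse-regression matrix aligns with span{a}, consistent with the paper's interpretation of SIR as a ridge-recovery method; the same setup also illustrates the caveat, since a symmetric g (e.g., g(t) = t^2) would make the slice means degenerate and SIR would miss the direction.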


62J07 Ridge regression; shrinkage estimators (Lasso)
62G08 Nonparametric regression and quantile regression
62H25 Factor analysis and principal components; correspondence analysis
Full Text: DOI arXiv

