
Smoothed quantile regression with large-scale inference. (English) Zbl 07648718

Summary: Quantile regression is a powerful tool for learning the relationship between a response variable and a multivariate predictor while exploring heterogeneous effects. This paper focuses on statistical inference for quantile regression in the “increasing dimension” regime. We provide a comprehensive analysis of a convolution smoothed approach that achieves adequate approximation to computation and inference for quantile regression. This method, which we refer to as conquer, turns the non-differentiable check function into a twice-differentiable, convex and locally strongly convex surrogate, which admits fast and scalable gradient-based algorithms to perform optimization, and multiplier bootstrap for statistical inference. Theoretically, we establish explicit non-asymptotic bounds on estimation and Bahadur-Kiefer linearization errors, from which we show that the asymptotic normality of the conquer estimator holds under a weaker requirement on dimensionality than needed for conventional quantile regression. The validity of multiplier bootstrap is also provided. Numerical studies confirm conquer as a practical and reliable approach to large-scale inference for quantile regression. Software implementing the methodology is available in the R package conquer.
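As a rough illustration of the smoothing idea described in the summary, the sketch below replaces the check function \(\rho_\tau(u) = u\,(\tau - 1\{u < 0\})\) with its convolution against a kernel of bandwidth \(h\) and minimizes the resulting smooth convex loss by plain gradient descent. It is not the conquer package's implementation: the logistic kernel, the fixed bandwidth `h`, the fixed step size, and all function names are assumptions chosen to keep the example short. With a logistic kernel the smoothed loss has the closed form \(\ell_h(u) = \tau u + h \log(1 + e^{-u/h})\), whose derivative is \(\tau - 1/(1 + e^{u/h})\).

```python
import numpy as np


def smoothed_check_loss(u, tau, h):
    """Check loss convolved with a logistic kernel of bandwidth h (assumed kernel)."""
    # tau*u + h*log(1 + exp(-u/h)); logaddexp avoids overflow for large |u|/h.
    return tau * u + h * np.logaddexp(0.0, -u / h)


def smoothed_check_grad(u, tau, h):
    """Derivative of the smoothed loss in u: tau - sigmoid(-u/h)."""
    # sigmoid(-u/h) written via tanh for numerical stability.
    return tau - 0.5 * (1.0 - np.tanh(u / (2.0 * h)))


def fit_smoothed_qr(X, y, tau=0.5, h=0.3, n_iter=500):
    """Gradient descent on the smoothed quantile loss (hypothetical helper)."""
    n, p = X.shape
    # The second derivative of the smoothed loss is at most 1/(4h), so the
    # gradient of the objective is Lipschitz with the constant below.
    lipschitz = np.linalg.eigvalsh(X.T @ X / n)[-1] / (4.0 * h)
    beta = np.zeros(p)
    for _ in range(n_iter):
        resid = y - X @ beta
        # Chain rule: d/d(beta) of loss(y - X beta) is -X^T loss'(resid) / n.
        grad = -X.T @ smoothed_check_grad(resid, tau, h) / n
        beta -= grad / lipschitz
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)
    print(fit_smoothed_qr(X, y, tau=0.5, h=0.3))
```

Because the surrogate is twice differentiable, a step size of one over the Lipschitz bound guarantees descent at every iteration. The bandwidth here is fixed by hand purely for illustration and is not a recommended choice.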

MSC:

62-XX Statistics
91-XX Game theory, economics, finance, and other social and behavioral sciences
