
Scalable estimation and inference for censored quantile regression process. (English) Zbl 07628845

Summary: Censored quantile regression (CQR) has become a valuable tool to study the heterogeneous association between a possibly censored outcome and a set of covariates, yet computation and statistical inference for CQR have remained a challenge for large-scale data with many covariates. In this paper, we focus on a smoothed martingale-based sequential estimating equations approach, to which scalable gradient-based algorithms can be applied. Theoretically, we provide a unified analysis of the smoothed sequential estimator and its penalized counterpart in increasing dimensions. When the covariate dimension grows with the sample size at a sublinear rate, we establish the uniform convergence rate (over a range of quantile indexes) and provide a rigorous justification for the validity of a multiplier bootstrap procedure for inference. In high-dimensional sparse settings, our results considerably improve the existing work on CQR by relaxing an exponential term of sparsity. We also demonstrate the advantage of the smoothed CQR over existing methods with both simulated experiments and data applications.
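As a rough illustration of why smoothing makes gradient-based fitting and multiplier-bootstrap inference practical, the sketch below fits a kernel-smoothed quantile regression by plain gradient descent and then refits it under random multiplier weights. It is a deliberately simplified, uncensored example and not the paper's sequential CQR estimator: the logistic kernel, the bandwidth `h`, the step size, the exponential multiplier weights, the simulated data, and the name `smoothed_qr` are all assumptions made for this illustration.

```python
import numpy as np

def smoothed_qr(X, y, tau, h, weights=None, lr=0.5, n_iter=500):
    """Gradient descent on a logistic-kernel-smoothed check loss (uncensored data).

    Hypothetical helper for illustration only; `weights` are optional
    per-observation multipliers used for the bootstrap refits.
    """
    n, p = X.shape
    w = np.ones(n) if weights is None else weights
    beta = np.zeros(p)
    for _ in range(n_iter):
        r = y - X @ beta
        # Smooth surrogate for the indicator 1{r < 0}:
        # 1 / (1 + exp(r / h)), evaluated in a numerically stable way.
        s = np.exp(-np.logaddexp(0.0, r / h))
        grad = X.T @ (w * (s - tau)) / n
        beta = beta - lr * grad
    return beta

# Simulated data (assumed): y = 1 + 2x + N(0, 1) noise, so the median
# regression coefficients are (1, 2).
rng = np.random.default_rng(0)
n = 1000
x = rng.standard_normal(n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.standard_normal(n)

tau, h = 0.5, 0.3
beta_hat = smoothed_qr(X, y, tau, h)

# Multiplier bootstrap: refit with i.i.d. mean-one random weights and use
# the spread of the refitted coefficients as a standard-error estimate.
B = 50
boot = np.array([
    smoothed_qr(X, y, tau, h, weights=rng.exponential(1.0, n))
    for _ in range(B)
])
se = boot.std(axis=0, ddof=1)
print("estimate:", beta_hat)
print("bootstrap s.e.:", se)
```

Because the smoothed loss is differentiable and convex, each iteration costs only one matrix-vector product, and the bootstrap reuses the same routine with different weights. Handling censoring as in the paper would additionally require solving the estimating equations sequentially over a grid of quantile levels, which this sketch does not attempt.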

MSC:

62F40 Bootstrap, jackknife and other resampling methods
62J05 Linear regression; mixed models
62J07 Ridge regression; shrinkage estimators (Lasso)
