Censored linear model in high dimensions. Penalised linear regression on high-dimensional data with left-censored response variable. (English) Zbl 1341.62218

Summary: Censored data are common in statistics and have been studied in depth in recent years (for some references, see [J. L. Powell, J. Econom. 25, 303–325 (1984; Zbl 0571.62100); K. Y. Chay and J. L. Powell, “Semiparametric censored regression models”, J. Econ. Perspect. 15, No. 4, 29–42 (2001; doi:10.1257/jep.15.4.29); S. A. Murphy et al., Math. Methods Stat. 8, No. 3, 407–425 (1999; Zbl 1033.62021)]). In this paper, we consider censored high-dimensional data. High-dimensional models are more complex than their low-dimensional counterparts, so different techniques are required. For the linear case, appropriate estimators based on penalised regression have been developed in recent years (see for example [P. J. Bickel et al., Ann. Stat. 37, No. 4, 1705–1732 (2009; Zbl 1173.62022); V. Koltchinskii, Bernoulli 15, No. 3, 799–828 (2009; Zbl 1452.62486)]). In particular, in sparse settings, \(l_1\)-penalised regression (also known as the LASSO) (see [R. Tibshirani, J. R. Stat. Soc., Ser. B 58, No. 1, 267–288 (1996; Zbl 0850.62538); P. Bühlmann and the second author, Statistics for high-dimensional data. Methods, theory and applications. Berlin: Springer (2011; Zbl 1273.62015)] and references therein) performs very well. Little theoretical work has been done on censored linear models in a high-dimensional context. We therefore consider a high-dimensional censored linear model in which the response variable is left-censored. We propose a new estimator for high-dimensional censored linear data. Non-asymptotic oracle inequalities are derived.
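
For orientation, the setting named in the summary can be sketched as follows. The notation (\(Y_i^*\), \(x_i\), \(\beta^0\), \(\varepsilon_i\), \(c\), \(\lambda\)) is assumed here for illustration and is not taken from the paper. A left-censored linear model observes only
\[ Y_i = \max(Y_i^*, c), \qquad Y_i^* = x_i^\top \beta^0 + \varepsilon_i, \qquad i = 1, \dots, n, \]
where \(c\) is a known censoring level, \(\beta^0 \in \mathbb{R}^p\), and \(p\) may be much larger than \(n\). For the uncensored linear model, one common form of the LASSO is
\[ \hat\beta \in \arg\min_{\beta \in \mathbb{R}^p} \Big\{ \frac{1}{n} \sum_{i=1}^n \big( Y_i - x_i^\top \beta \big)^2 + \lambda \|\beta\|_1 \Big\}, \qquad \lambda > 0. \]
The estimator proposed in the paper applies penalised regression to the left-censored response; its exact form is not stated in this summary.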


62J05 Linear regression; mixed models
62H12 Estimation in multivariate analysis
62N01 Censored data models
62N02 Estimation in survival analysis and censored data




[1] Belloni, A; Chernozhukov, V, \(\ell_1\)-penalized quantile regression in high-dimensional sparse models, Ann Stat, 39, 82-130, (2011) · Zbl 1209.62064
[2] Bickel, PJ; Ritov, Y; Tsybakov, AB, Simultaneous analysis of lasso and Dantzig selector, Ann Stat, 37, 1705-1732, (2009) · Zbl 1173.62022
[3] Bühlmann, P; van de Geer, S, Statistics for high-dimensional data, (2011), Springer, Heidelberg · Zbl 1273.62015
[4] Candes, E; Tao, T, The Dantzig selector: statistical estimation when p is much larger than n, Ann Stat, 35, 2313-2351, (2007) · Zbl 1139.62019
[5] Chay, KY; Powell, JL, Semiparametric censored regression models, J Econ Perspect, 15, 29-42, (2001)
[6] Fan, J; Li, R, Variable selection via nonconcave penalized likelihood and its oracle properties, J Am Stat Assoc, 96, 1348-1360, (2001) · Zbl 1073.62547
[7] Friedman, J; Hastie, T; Tibshirani, R, Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9, 432-441, (2008) · Zbl 1143.62076
[8] Koltchinskii, V, The Dantzig selector and sparsity oracle inequalities, Bernoulli, 15, 799-828, (2009) · Zbl 1452.62486
[9] Koltchinskii, V, Oracle inequalities in empirical risk minimization and sparse recovery problems: École d'Été de Probabilités de Saint-Flour XXXVIII-2008, vol 2033, (2011), Springer, Berlin · Zbl 1223.91002
[10] Li, Y; Zhu, J, L1-norm quantile regression, J Comput Graph Stat, 17, 163-185, (2008)
[11] Murphy, SA; van der Vaart, AW; Wellner, JA, Current status regression, Math Methods Stat, 8, 407-425, (1999) · Zbl 1033.62021
[12] Powell, JL, Least absolute deviations estimation for the censored regression model, J Econom, 25, 303-325, (1984) · Zbl 0571.62100
[13] Städler, N; Bühlmann, P; van de Geer, S, \(l_1\)-penalization for mixture regression models, Test, 19, 209-285, (2010) · Zbl 1203.62128
[14] Tibshirani, R, Regression shrinkage and selection via the lasso, J R Stat Soc Ser B, 58, 267-288, (1996) · Zbl 0850.62538
[15] van de Geer, S, Empirical processes in M-estimation, (2000), Cambridge University Press, Cambridge · Zbl 1179.62073
[16] van de Geer, S; Akritas, MG (ed.); Politis, DN (ed.), Adaptive quantile regression, 235-250, (2003), Amsterdam
[17] van de Geer, S, The deterministic Lasso, JSM Proceedings, 140, (2007), American Statistical Association
[18] van de Geer, S, High-dimensional generalized linear models and the lasso, Ann Stat, 36, 614-645, (2008) · Zbl 1138.62323
[19] van de Geer, S; Bühlmann, P, On the conditions used to prove oracle results for the lasso, Electron J Stat, 3, 1360-1392, (2009) · Zbl 1327.62425
[20] Yuan, M; Lin, Y, Model selection and estimation in regression with grouped variables, J R Stat Soc: Ser B (Stat Methodol), 68, 49-67, (2006) · Zbl 1141.62030
[21] Zou, H, The adaptive lasso and its oracle properties, J Am Stat Assoc, 101, 1418-1429, (2006) · Zbl 1171.62326
[22] Zou, H; Hastie, T, Regularization and variable selection via the elastic net, J R Stat Soc Ser B, 67, 301-320, (2005) · Zbl 1069.62054