On the distribution, model selection properties and uniqueness of the Lasso estimator in low and high dimensions. (English) Zbl 1440.62060

Consider the linear model $$y=X\beta+\varepsilon$$, where $$y$$ is an $$n\times1$$ vector of observations, $$X$$ is an $$n\times p$$ regressor matrix, $$\beta\in\mathbb{R}^p$$ is an unknown parameter vector, and $$\varepsilon$$ is a normally distributed error term. In this setting, the weighted Lasso estimator is the solution to the minimization problem $\min_{\beta\in\mathbb{R}^p}\left\{\|y-X\beta\|^2+2\sum_{j=1}^p\lambda_{n,j}|\beta_j|\right\}\,,$ where the $$\lambda_{n,j}\geq0$$ are specified weights. The authors investigate properties of this estimator in both the low-dimensional ($$p\leq n$$) and high-dimensional ($$p>n$$) settings.
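To make the objective above concrete, the following is a minimal sketch of how a minimizer can be computed numerically by cyclic coordinate descent. This is a standard generic technique and not a method attributed to the authors; the function names `soft_threshold` and `weighted_lasso_cd`, the sweep limit and the tolerance are all assumptions made for illustration. Each coordinate update minimizes $$\|y-X\beta\|^2+2\sum_j\lambda_{n,j}|\beta_j|$$ in $$\beta_j$$ alone, which gives a soft-thresholding step at level $$\lambda_{n,j}$$.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t (assumes t >= 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso_cd(X, y, lam, n_sweeps=200, tol=1e-10):
    """Cyclic coordinate descent for ||y - X b||^2 + 2 * sum_j lam[j] * |b_j|.

    X is a list of n rows of length p, y a list of length n, and lam a list
    of p non-negative weights. Illustrative sketch, not the authors' code.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # zero column: the fit ignores beta_j, keep it at 0
            # z = x_j' (residual with the contribution of coordinate j added back)
            z = sum(X[i][j] * resid[i] for i in range(n)) + col_sq[j] * beta[j]
            new = soft_threshold(z, lam[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta
```

With an identity design the update reduces to componentwise soft-thresholding of $$y$$, so a coordinate with weight $$\lambda_{n,j}=0$$ is left unshrunk. When $$p>n$$ the minimizer need not be unique, and this sketch returns only one of the possible solutions.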
In the low-dimensional case, explicit expressions are given for the distribution function of the Lasso estimator and the corresponding density function conditional on knowing which components of the estimator are non-zero. The relationship between the Lasso estimator and the least-squares estimator is also investigated in terms of shrinkage sets: for any $$b\in\mathbb{R}^p$$, this is the set $$S(b)\subseteq\mathbb{R}^p$$ such that the Lasso estimator is equal to $$b$$ if and only if the least-squares estimator is in $$S(b)$$. Explicit expressions for these shrinkage sets are given.
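As a worked special case of a shrinkage set, consider an orthonormal design, $$X^\prime X=I_p$$. This restriction is an assumption made here purely for illustration; the expressions in the paper cover general designs and are not reproduced here. The symbols $$\hat\beta_{LS}$$ and $$\hat\beta_{L}$$ for the least-squares and Lasso estimators are likewise notation introduced for this sketch.

```latex
% Assumption for illustration: orthonormal design, X'X = I_p (so p <= n).
% Then \hat\beta_{LS} = X'y and the objective separates by coordinate:
\|y-X\beta\|^2 + 2\sum_{j=1}^p \lambda_{n,j}|\beta_j|
  = \|y-X\hat\beta_{LS}\|^2
  + \sum_{j=1}^p\left\{(\hat\beta_{LS,j}-\beta_j)^2 + 2\lambda_{n,j}|\beta_j|\right\}.
% Minimizing each summand gives soft-thresholding at \lambda_{n,j}:
\hat\beta_{L,j} = \operatorname{sign}(\hat\beta_{LS,j})\,
  \max\{|\hat\beta_{LS,j}|-\lambda_{n,j},\,0\}.
% Hence the shrinkage set is a product of points and intervals:
S(b) = \prod_{j=1}^p S_j(b_j), \qquad
S_j(b_j) = \begin{cases}
  \{\,b_j + \lambda_{n,j}\operatorname{sign}(b_j)\,\} & \text{if } b_j \neq 0,\\
  [-\lambda_{n,j},\,\lambda_{n,j}] & \text{if } b_j = 0.
\end{cases}
```

For instance, with $$p=2$$, $$\lambda_{n,1}=\lambda_{n,2}=1$$ and $$b=(2,0)^\prime$$, this gives $$S(b)=\{3\}\times[-1,1]$$: the Lasso estimator equals $$b$$ exactly when the first least-squares coordinate is $$3$$ and the second lies in $$[-1,1]$$.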
In the high-dimensional setting, formulas for the distribution of the Lasso estimator are again given, and selection regions are investigated to relate the Lasso estimator to $$X^\prime y$$. For $$b\in\mathbb{R}^p$$, these are sets $$T(b)\subseteq\mathbb{R}^p$$ such that the Lasso estimator is equal to $$b$$ if and only if $$X^\prime y$$ lies in $$T(b)$$. The authors also give a geometric construction of the structural set, that is, the set of covariates that may be included in the Lasso solution for at least some values of $$y$$. Finally, a necessary and sufficient condition for uniqueness of the Lasso estimator is given.
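The link between a Lasso solution and $$X^\prime y$$ can be illustrated through the first-order (Karush-Kuhn-Tucker) optimality conditions of the convex objective: $$b$$ is a minimizer if and only if each component of $$X^\prime y-X^\prime Xb$$ equals $$\lambda_{n,j}\operatorname{sign}(b_j)$$ when $$b_j\neq0$$ and has absolute value at most $$\lambda_{n,j}$$ when $$b_j=0$$. The sketch below checks these conditions numerically. It is a generic optimality check, not the authors' construction of $$T(b)$$ or their uniqueness condition; the name `is_lasso_solution` and the tolerance are assumptions made for illustration.

```python
def is_lasso_solution(X, y, lam, b, tol=1e-9):
    """Check the KKT conditions of ||y - X b||^2 + 2 * sum_j lam[j] * |b_j| at b.

    Returns True if b minimizes the objective (up to tol). The check depends
    on y only through X'y, since X'(y - X b) = X'y - X'X b.
    Illustrative sketch, not the authors' code.
    """
    n, p = len(X), len(X[0])
    resid = [y[i] - sum(X[i][j] * b[j] for j in range(p)) for i in range(n)]
    for j in range(p):
        g = sum(X[i][j] * resid[i] for i in range(n))  # (X'y - X'X b)_j
        if b[j] > 0 and abs(g - lam[j]) > tol:
            return False
        if b[j] < 0 and abs(g + lam[j]) > tol:
            return False
        if b[j] == 0 and abs(g) > lam[j] + tol:
            return False
    return True


# Toy high-dimensional example (n = 1, p = 2) with a duplicated column.
X = [[1.0, 1.0]]
y = [2.0]
lam = [1.0, 1.0]
candidates = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, 0.0]]
results = [is_lasso_solution(X, y, lam, b) for b in candidates]
```

In this toy example the first three candidates all satisfy the conditions while the last does not, so the minimizer is not unique. Duplicated columns are one simple way uniqueness can fail when $$p>n$$.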
The authors make very few assumptions on the regressor matrix $$X$$, and many of their results continue to hold if the assumption of normally distributed errors is relaxed.

MSC:

 62E15 Exact distribution theory in statistics
 62J05 Linear regression; mixed models
 62J07 Ridge regression; shrinkage estimators (Lasso)