Lasso-type recovery of sparse representations for high-dimensional data. (English) Zbl 1155.62050
Summary: The Lasso is an attractive technique for regularization and variable selection for high-dimensional data, where the number of predictor variables $p_n$ is potentially much larger than the number of samples $n$. However, it was recently discovered that the sparsity pattern of the Lasso estimator can only be asymptotically identical to the true sparsity pattern if the design matrix satisfies the so-called irrepresentable condition. The latter condition can easily be violated in the presence of highly correlated variables. We examine the behavior of the Lasso estimator when the irrepresentable condition is relaxed. Even though the Lasso cannot recover the correct sparsity pattern, we show that the estimator is still consistent in the $\ell_2$-norm sense for fixed designs under conditions on (a) the number $s_n$ of nonzero components of the vector $\beta_n$ and (b) the minimal singular values of design matrices that are induced by selecting small subsets of variables. Furthermore, a rate of convergence result is obtained on the $\ell_2$ error with an appropriate choice of the smoothing parameter. The rate is shown to be optimal under the condition of bounded maximal and minimal sparse eigenvalues. Our results imply that, with high probability, all important variables are selected. The set of selected variables is a meaningful reduction of the original set of variables. Finally, our results are illustrated with the detection of closely adjacent frequencies, a problem encountered in astrophysics.
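To make the central claim concrete, here is a minimal sketch, not taken from the paper, contrasting exact support recovery with $\ell_2$ error. It builds a design in which an irrelevant column is nearly a linear combination of two relevant ones, the kind of correlation that tends to violate the irrepresentable condition; the dimensions, coefficients, and regularization level alpha are illustrative assumptions, and the sketch uses scikit-learn's Lasso rather than the authors' own code.

```python
# Illustrative sketch only: contrasts exact support recovery with the
# l2 estimation error of the Lasso when an irrepresentable-type
# condition is violated. All settings (n, p, beta, alpha) are
# assumptions chosen for illustration, not the paper's experiments.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s = 100, 500, 5                 # samples, predictors (p >> n), sparsity

X = rng.standard_normal((n, p))
# Make an *irrelevant* column nearly a linear combination of two relevant
# ones; this strong correlation tends to break exact pattern recovery.
X[:, 5] = 0.7 * (X[:, 0] + X[:, 1]) + 0.1 * rng.standard_normal(n)

beta = np.zeros(p)
beta[:s] = [2.0, 2.0, 1.5, 1.0, 0.5]  # true sparse coefficient vector
y = X @ beta + 0.5 * rng.standard_normal(n)

fit = Lasso(alpha=0.2, max_iter=100_000).fit(X, y)

support_true = set(np.flatnonzero(beta))
support_hat = set(np.flatnonzero(fit.coef_))
print("exact support recovered:", support_hat == support_true)
print("spurious/missed vars:   ", sorted(support_hat ^ support_true))
print("relative l2 error:      ",
      np.linalg.norm(fit.coef_ - beta) / np.linalg.norm(beta))
```

On typical runs the estimated support differs from the true one (the spurious correlated column is often selected, and small true coefficients may be missed), while the relative $\ell_2$ error stays moderate, mirroring the behavior described in the summary.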

MSC:
62J07 Ridge regression; shrinkage estimators
62F12 Asymptotic properties of parametric estimators
62G08 Nonparametric regression
62F07 Statistical ranking and selection procedures
62J05 Linear regression