Sparsity and smoothness via the fused lasso. (English) Zbl 1060.62049

Summary: The lasso penalizes a least squares regression by the sum of the absolute values (\(L_1\)-norm) of the coefficients. The form of this penalty encourages sparse solutions (with many coefficients equal to \(0\)). We propose the ‘fused lasso’, a generalization that is designed for problems with features that can be ordered in some meaningful way. The fused lasso penalizes the \(L_1\)-norm of both the coefficients and their successive differences. Thus it encourages sparsity of the coefficients and also sparsity of their differences – i.e., local constancy of the coefficient profile. The fused lasso is especially useful when the number of features \(p\) is much greater than \(N\), the sample size. The technique is also extended to the ‘hinge’ loss function that underlies the support vector classifier. We illustrate the methods on examples from protein mass spectroscopy and gene expression data.
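The penalty structure described in the summary can be sketched in a few lines. The following is a minimal illustration, not the paper's estimation algorithm: the function name and the equal weights \(\lambda_1 = \lambda_2 = 1\) are assumptions chosen for demonstration. It shows why a locally constant coefficient profile is favoured: for two vectors with the same \(L_1\)-norm, the fused term charges far less to the profile whose successive differences are sparse.

```python
import numpy as np

def fused_lasso_penalty(beta, lam1, lam2):
    # lam1 * sum_j |beta_j|  +  lam2 * sum_j |beta_j - beta_{j-1}|
    beta = np.asarray(beta, dtype=float)
    return lam1 * np.abs(beta).sum() + lam2 * np.abs(np.diff(beta)).sum()

# Two profiles with identical L1-norm (3.0) but different smoothness:
flat   = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0])  # locally constant
wiggly = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0])  # alternating

print(fused_lasso_penalty(flat, 1.0, 1.0))    # 3.0 + 2.0 = 5.0
print(fused_lasso_penalty(wiggly, 1.0, 1.0))  # 3.0 + 5.0 = 8.0
```

With equal weights, the flat profile incurs penalty 5.0 against 8.0 for the wiggly one, so the fused lasso prefers the locally constant solution even though both are equally sparse in the ordinary \(L_1\) sense.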


62G08 Nonparametric regression and quantile regression
62P10 Applications of statistics to biology and medical sciences; meta analysis
62J05 Linear regression; mixed models


SQOPT; ElemStatLearn
Full Text: DOI


[1] Adam B.-L., Cancer Res. 63 pp 3609– (2003)
[2] Boser B., Proc. Computational Learning Theory II, Philadelphia (1992)
[3] DOI: 10.1137/S003614450037906X · Zbl 0979.94010 · doi:10.1137/S003614450037906X
[4] Donoho D., Biometrika 81 pp 425– (1994)
[5] Efron B., Technical Report (2002)
[6] Geyer C., On the Asymptotics of Convex Stochastic Optimization, Technical Report, University of Minnesota, Minneapolis (1996)
[7] Gill P. E., Technical Report NA 97-4 (1997)
[8] DOI: 10.1126/science.286.5439.531 · doi:10.1126/science.286.5439.531
[9] Hastie T., The Elements of Statistical Learning; Data Mining, Inference and Prediction (2001) · Zbl 0973.62007
[10] Hoerl A. E., Technometrics 12 pp 55– (1970)
[11] Knight K., Ann. Statist. 28 pp 1356– (2000)
[12] Land S., Technical Report (1996)
[13] Lee Y., Technical Report (2002)
[14] DOI: 10.1016/S0140-6736(02)07746-2 · doi:10.1016/S0140-6736(02)07746-2
[15] Rosset S., Zhu J., Adaptable, Efficient and Robust Methods for Regression and Classification via Piecewise Linear Regularized Coefficient Paths, Technical Report, Stanford University, Stanford (2003)
[16] Rosset S., J. Mach. Learn. Res. 5 pp 941– (2004)
[17] Stein C., Ann. Statist. 9 pp 1131– (1981)
[18] Tibshirani R., J. R. Statist. Soc. 58 pp 267– (1996)
[19] Tibshirani R., Proc. Natn. Acad. Sci. USA 99 pp 6567– (2001)
[20] Tibshirani R., Saunders M., Rosset S., Zhu J., Knight K., Sparsity and Smoothness via the Fused Lasso, Technical Report, Stanford University, Stanford (2004) · Zbl 1060.62049
[21] Vapnik V., The Nature of Statistical Learning Theory (1996) · Zbl 0934.62009
[22] Wold H., Perspectives in Probability and Statistics, in Honor of M. S. Bartlett pp 117– (1975)
[23] Zhu J., Technical Report (2003)