Convex biclustering. (English) Zbl 1366.62208

Summary: In the biclustering problem, we seek to simultaneously group observations and features. While biclustering has applications in a wide array of domains, ranging from text mining to collaborative filtering, the problem of identifying structure in high-dimensional genomic data motivates this work. In this context, biclustering enables us to identify subsets of genes that are co-expressed only within a subset of experimental conditions. We present a convex formulation of the biclustering problem that possesses a unique global minimizer and an iterative algorithm, COBRA, that is guaranteed to identify it. Our approach generates an entire solution path of possible biclusters as a single tuning parameter is varied. We also show how to reduce the problem of selecting this tuning parameter to solving a trivial modification of the convex biclustering problem. The key contributions of our work are its simplicity, interpretability, and algorithmic guarantees – features that arguably are lacking in the current alternative algorithms. We demonstrate the advantages of our approach, which includes stably and reproducibly identifying biclusterings, on simulated and real microarray data.


62P10 Applications of statistics to biology and medical sciences; meta analysis
62J07 Ridge regression; shrinkage estimators (Lasso)
62H30 Classification and discrimination; cluster analysis (statistical aspects)
90C25 Convex programming
92D20 Protein sequences, DNA sequences
Full Text: DOI arXiv


[1] Bauschke, A Dykstra-like algorithm for two monotone operators, Pacific Journal of Optimization 4 pp 383– (2008) · Zbl 1176.47051
[2] Bergmann, Iterative signature algorithm for the analysis of large-scale gene expression data, Physical Review E 67 pp 031902– (2003)
[3] Boyd, Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends in Machine Learning 3 pp 1– (2011) · Zbl 1229.90122
[4] Busygin, Biclustering in data mining, Computers and Operations Research 35 pp 2964– (2008) · Zbl 1144.68309
[5] Cheng, Proceedings of the 8th International Conference of Intelligent Systems and Molecular Biology pp 93– (2000)
[6] Chi, Splitting methods for convex clustering, Journal of Computational and Graphical Statistics 24 pp 994– (2015) · Zbl 1060.62028
[7] Coifman, Wavelets and Multiscale Analysis, Applied and Numerical Harmonic Analysis pp 161– (2011)
[8] Dhillon, Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining pp 269– (2001)
[9] Dykstra, An algorithm for restricted least squares regression, Journal of the American Statistical Association 78 pp 837– (1983) · Zbl 0535.62063
[10] Friedman, Pathwise coordinate optimization, Annals of Applied Statistics 1 pp 302– (2007) · Zbl 1378.90064
[11] Hahsler, Getting things in order: An introduction to the R Package seriation, Journal of Statistical Software 25 pp 1– (2008)
[12] Hastie, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2009) · Zbl 1273.62005
[13] Hocking, Proceedings of the 28th International Conference on Machine Learning (ICML-11) pp 745– (2011)
[14] Hofmann, Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence pp 688– (1999)
[15] Hubert, Comparing partitions, Journal of Classification 2 pp 193– (1985) · Zbl 0587.62128
[16] Kluger, Spectral biclustering of microarray data: Coclustering genes and conditions, Genome Research 13 pp 703– (2003)
[17] Lange, Optimization transfer using surrogate objective functions (with discussion), Journal of Computational and Graphical Statistics 9 pp 1– (2000)
[18] Langfelder, Defining clusters from a hierarchical cluster tree: The Dynamic Tree Cut package for R, Bioinformatics 24 pp 719– (2008) · Zbl 05511533
[19] Lazzeroni, Plaid models for gene expression data, Statistica Sinica 12 pp 61– (2002) · Zbl 1004.62084
[20] Lee, Biclustering via sparse singular value decomposition, Biometrics 66 pp 1087– (2010) · Zbl 1233.62182
[21] Lindsten , F. Ohlsson , H. Ljung , L. 2011 Just relax and come clustering! A convexification of k-means clustering
[22] Madeira, Biclustering algorithms for biological data analysis: A survey, IEEE/ACM Transactions on Computational Biology and Bioinformatics 1 pp 24– (2004) · Zbl 05103330
[23] Mazumder, Spectral regularization algorithms for learning large incomplete matrices, Journal of Machine Learning Research 11 pp 2287– (2010) · Zbl 1242.68237
[24] Meilă, Comparing clusterings-An information based distance, Journal of Multivariate Analysis 98 pp 873– (2007) · Zbl 1298.91124
[25] Meinshausen, Lasso-type recovery of sparse representations for high-dimensional data, Annals of Statistics 37 pp 246– (2009) · Zbl 1155.62050
[26] Pelckmans , K. De Brabanter , J. Suykens , J. De Moor , B. 2005 Convex clustering shrinkage PASCAL Workshop on Statistics and Optimization of Clustering Workshop
[27] Rand, Objective criteria for the evaluation of clustering methods, Journal of the American Statistical Association 66 pp 846– (1971)
[28] Shabalin, Finding large average submatrices in high dimensional data, Annals of Applied Statistics 3 pp 985– (2009) · Zbl 1196.62087
[29] Sill, Robust biclustering by sparse singular value decomposition incorporating stability selection, Bioinformatics 27 pp 2089– (2011)
[30] Sørlie, Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications, Proceedings of the National Academy of Sciences 98 pp 10869– (2001)
[31] Sørlie, Repeated observation of breast tumor subtypes in independent gene expression data sets, Proceedings of the National Academy of Sciences 100 pp 8418– (2003)
[32] Tan, Sparse biclustering of transposable data, Journal of Computational and Graphical Statistics 23 pp 985– (2014)
[33] Tanay, Biclustering algorithms: A survey. Number 9 in Computer and Information Science Series (2005)
[34] Tibshirani, Regression shrinkage and selection via the Lasso, Journal of the Royal Statistical Society, Series B (Statistical Methodology) 58 pp 267– (1996) · Zbl 0850.62538
[35] Tibshirani, Sparsity and smoothness via the fused Lasso, Journal of the Royal Statistical Society, Series B (Statistical Methodology) 67 pp 91– (2005) · Zbl 1060.62049
[36] Tibshirani, The solution path of the generalized Lasso, Annals of Statistics 39 pp 1335– (2011) · Zbl 1234.62107
[37] Tothill, Novel molecular subtypes of serous and endometrioid ovarian cancer linked to clinical outcome, Clinical Cancer Research 14 pp 5198– (2008)
[38] Turner, Improved biclustering of microarray data demonstrated through systematic performance tests, Computational Statistics and Data Analysis 48 pp 235– (2005) · Zbl 1429.62267
[39] Witten, A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis, Biostatistics 10 pp 515– (2009)
[40] Wold, Cross-validatory estimation of the number of components in factor and principal components models, Technometrics 20 pp 397– (1978) · Zbl 0403.62032
[41] Wu, Coordinate descent algorithms for Lasso penalized regression, Annals of Applied Statistics 2 pp 224– (2008) · Zbl 1137.62045
[42] Zou, The adaptive lasso and its oracle properties, Journal of the American Statistical Association 101 pp 1418– (2006) · Zbl 1171.62326
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.