k-means++

swMATH ID: 21622
Software Authors: D. Arthur, S. Vassilvitskii
Description: k-means++: The advantages of careful seeding. The k-means method is a widely used clustering technique that seeks to minimize the average squared distance between points in the same cluster. Although it offers no accuracy guarantees, its simplicity and speed are very appealing in practice. By augmenting k-means with a very simple, randomized seeding technique, we obtain an algorithm that is Θ(log k)-competitive with the optimal clustering. Preliminary experiments show that our augmentation improves both the speed and the accuracy of k-means, often quite dramatically.
Homepage: https://dl.acm.org/citation.cfm?id=1283383.1283494
Related Software: UCI-ml; AS 136; clusfind; PRMLT; Silhouettes; Scikit; StreamKM++; apcluster; APCluster; R; GitHub; J-MEANS; PMTK; FLANN; MNIST; t-SNE; Adam; clusterpath; BSDS; ElemStatLearn
Cited in: 195 Documents

Standard Articles
1 Publication describing the Software, including 1 Publication in zbMATH
\(k\)-means++: the advantages of careful seeding. Arthur, David; Vassilvitskii, Sergei. 2007. Zbl 1302.68273

Cited by 513 Authors
5 Jaiswal, Ragesh
4 Li, Min
3 Bhattacharya, Anup Kumar
3 Sra, Suvrit
3 Xu, Dachuan
2 Ailon, Nir
2 Aloise, Daniel
2 Arthur, David
2 Bai, Ruibin
2 Barnhart, Katherine R.
2 Blanchard, Gilles
2 Bonald, Thomas
2 Brunsch, Tobias
2 Chan, Laiwan
2 Chen, Zhitang
2 Cucuringu, Mihai
2 Deng, Xiao-Tie
2 Deshpande, Amit
2 Dey, Tamal Krishna
2 Feldman, Dan
2 Gao, Yansong
2 Giraud, Christophe
2 Gribonval, Rémi
2 Hämäläinen, Joonas
2 Hewitt, Mike
2 Hosseini, Reshad
2 Hu, Shoubo
2 Jiang, Xiaoping
2 Kärkkäinen, Tommi
2 Kaufmann, Emilie
2 Kendall, Graham
2 Keriven, Nicolas
2 Kim, Seoung Bum
2 Kleiber, William
2 Kumar, Amit
2 Lelarge, Marc
2 Mahajan, Meena
2 Mladenović, Nenad
2 Nimbhorkar, Prajakta
2 Piccialli, Veronica
2 Pratap, Rameshwar
2 Röglin, Heiko
2 Rossi, Alfred
2 Schmidt, Melanie
2 Sidiropoulos, Anastasios
2 Sohler, Christian
2 Strohmer, Thomas
2 Sudoso, Antonio M.
2 Traonmilin, Yann
2 Tremblay, Nicolas
2 Varadarajan, Kasturi R.
2 Vassilvitskii, Sergei
2 Verzelen, Nicolas
2 Wiens, Ashton
2 Xie, Ting
2 Yu, Jaehong
2 Zhang, Dongmei
2 Zhang, Jie
1 Ab Rahman, Khairul Shakir
1 Agarwal, Manu
1 Ahmadian, Sara
1 Aksoy, Selim
1 Alata, Olivier
1 Alencar, Alisson S. C.
1 Allen, Genevera I.
1 Alvo, Mayer
1 Amblard, Pierre-Olivier
1 An, Qiang
1 Arbel, Julyan
1 Arı, Çağlar
1 Arıkan, Orhan
1 Asencio-Cortes, Gualberto
1 Avilés-Cruz, Carlos
1 Bach, Francis R.
1 Bagirov, Adil M.
1 Banerjee, Arindam
1 Barthelmé, Simon
1 Barwey, Shivam
1 Batet, Montserrat
1 Bazzi, Marya
1 Beckman, Paul G.
1 Benyó, Zoltán
1 Bera, Debajyoti
1 Bertsimas, Dimitris John
1 Beylkin, Gregory
1 Binétruy, Christophe
1 Bock, Stefan
1 Bonny, Talal
1 Boubekki, Ahcène
1 Boutalbi, Rafika
1 Boyd, Nicholas
1 Brandyberry, David R.
1 Brécheteau, Claire
1 Brefeld, Ulf
1 Brook, Bindi S.
1 Brunet-Saumard, Camille
1 Bui, Hung H.
1 Bunea, Florentina
1 Burgard, Jan Pablo
1 Canas, Guillermo D.
...and 413 more Authors

Cited in 74 Serials
10 Information Sciences
10 Machine Learning
10 Journal of Machine Learning Research (JMLR)
9 Theoretical Computer Science
8 Data Mining and Knowledge Discovery
6 SIAM Journal on Computing
6 Journal of Classification
6 Mathematical Problems in Engineering
5 Algorithmica
4 Computers & Operations Research
4 European Journal of Operational Research
4 Computational Statistics and Data Analysis
4 Statistics and Computing
3 Information Processing Letters
3 Neural Computation
3 Pattern Recognition
3 Mathematical Programming. Series A. Series B
3 Mathematical Statistics and Learning
2 Computer Methods in Applied Mechanics and Engineering
2 The Annals of Statistics
2 Discrete & Computational Geometry
2 Computational Geometry
2 SIAM Journal on Optimization
2 Applied and Computational Harmonic Analysis
2 International Journal of Computer Vision
2 Journal of Combinatorial Optimization
2 Quantum Information Processing
2 International Journal of Wavelets, Multiresolution and Information Processing
2 Advances in Data Analysis and Classification. ADAC
2 Electronic Journal of Statistics
2 Mathematical Geosciences
2 Algorithms
1 Journal of Computational Physics
1 Journal of Fluid Mechanics
1 ACM Transactions on Mathematical Software
1 Calcolo
1 Fuzzy Sets and Systems
1 Journal of Multivariate Analysis
1 Journal of Optimization Theory and Applications
1 Computer Aided Geometric Design
1 Information and Computation
1 Journal of Scientific Computing
1 Signal Processing
1 Annals of Operations Research
1 MSCS. Mathematical Structures in Computer Science
1 Communications in Statistics. Simulation and Computation
1 ETNA. Electronic Transactions on Numerical Analysis
1 INFORMS Journal on Computing
1 Soft Computing
1 Chaos
1 Communications in Nonlinear Science and Numerical Simulation
1 Combustion Theory and Modelling
1 Journal of High Energy Physics
1 Foundations of Computational Mathematics
1 Journal of Systems Science and Complexity
1 ASTIN Bulletin
1 North American Actuarial Journal
1 Statistical Methods and Applications
1 Journal of Statistical Mechanics: Theory and Experiment
1 Mathematical Biosciences and Engineering
1 Oberwolfach Reports
1 Journal of Industrial and Management Optimization
1 Computational & Mathematical Methods in Medicine
1 Statistical Analysis and Data Mining
1 Nonlinear Analysis. Hybrid Systems
1 The Annals of Applied Statistics
1 Journal of Computational and Graphical Statistics
1 Symmetry
1 Journal of Computational Geometry
1 Journal of Theoretical Biology
1 ISRN Biomathematics
1 SIAM/ASA Journal on Uncertainty Quantification
1 Proceedings of the Royal Society of London. A. Mathematical, Physical and Engineering Sciences
1 SIAM Journal on Mathematics of Data Science

Cited in 30 Fields
105 Statistics (62-XX)
91 Computer science (68-XX)
38 Operations research, mathematical programming (90-XX)
12 Numerical analysis (65-XX)
12 Game theory, economics, finance, and other social and behavioral sciences (91-XX)
9 Probability theory and stochastic processes (60-XX)
7 Systems theory; control (93-XX)
6 Combinatorics (05-XX)
6 Biology and other natural sciences (92-XX)
4 Information and communication theory, circuits (94-XX)
3 Quantum theory (81-XX)
2 General and overarching topics; collections (00-XX)
2 Linear and multilinear algebra; matrix theory (15-XX)
2 Ordinary differential equations (34-XX)
2 Harmonic analysis on Euclidean spaces (42-XX)
2 Functional analysis (46-XX)
2 Calculus of variations and optimal control; optimization (49-XX)
2 General topology (54-XX)
2 Mechanics of deformable solids (74-XX)
2 Fluid mechanics (76-XX)
2 Geophysics (86-XX)
1 Algebraic geometry (14-XX)
1 Several complex variables and analytic spaces (32-XX)
1 Partial differential equations (35-XX)
1 Dynamical systems and ergodic theory (37-XX)
1 Convex and discrete geometry (52-XX)
1 Differential geometry (53-XX)
1 Optics, electromagnetic theory (78-XX)
1 Classical thermodynamics, heat transfer (80-XX)
1 Relativity and gravitational theory (83-XX)