Meng, Xiangrui; Bradley, Joseph; Yavuz, Burak; Sparks, Evan; Venkataraman, Shivaram; Liu, Davies; Freeman, Jeremy; Tsai, DB; Amde, Manish; Owen, Sean; Xin, Doris; Xin, Reynold; Franklin, Michael J.; Zadeh, Reza; Zaharia, Matei; Talwalkar, Ameet
MLlib: machine learning in Apache Spark. (English) Zbl 1360.68697
J. Mach. Learn. Res. 17, Paper No. 34, 7 p. (2016).

Summary: Apache Spark is a popular open-source platform for large-scale data processing that is well-suited for iterative machine learning tasks. In this paper we present MLlib, Spark’s open-source distributed machine learning library. MLlib provides efficient functionality for a wide range of learning settings and includes several underlying statistical, optimization, and linear algebra primitives. Shipped with Spark, MLlib supports several languages and provides a high-level API that leverages Spark’s rich ecosystem to simplify the development of end-to-end machine learning pipelines. MLlib has experienced rapid growth due to its vibrant open-source community of over 140 contributors, and includes extensive documentation to support further growth and to let users quickly get up to speed.

Cited in 26 Documents

MSC: 68T05 Learning and adaptive systems in artificial intelligence

Software: Scikit; NumPy; MapReduce; Mahout; MLlib; Apache Spark; PLANET; MLbase; GraphX; Breeze; GitHub; pmml