SPRINT
swMATH ID: 11760
Software Authors: Shafer CJ, Agrawal R, Mehta M
Description: SPRINT: a scalable parallel classifier for data mining. Classification is an important data mining problem. Although classification is a well-studied problem, most of the current classification algorithms require that all or a portion of the entire dataset remain permanently in memory. This limits their suitability for mining over large databases. We present a new decision-tree-based classification algorithm, called SPRINT, that removes all of the memory restrictions and is fast and scalable. The algorithm has also been designed to be easily parallelized, allowing many processors to work together to build a single consistent model. This parallelization, also presented here, exhibits excellent scalability as well. The combination of these characteristics makes the proposed algorithm an ideal tool for data mining.
Homepage: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.40.2497&rep=rep1&type=pdf
Related Software: SLIQ; C4.5; UCI-ml; RainForest; AutoClass; Fuzzy ARTMAP; PLANET; WEKA; gSpan; LOGML; Genocop; MLC++; Apache Spark; MLlib; MapReduce; SPSS; StreamKrimp; MOA; CudaRF; CUDA
Cited in: 27 Documents

Cited by 70 Authors
2 Anagnostopoulos, Georgios C.; 2 Bifet, Albert; 2 DeMara, Ronald F.; 2 Gama, João; 2 Georgiopoulos, Michael; 2 Gonzalez, Avelino J.; 2 Nguyen, Hung Son; 2 Secretan, Jimmy; 1 Aggarwal, Charu C.; 1 Amado, Nuno; 1 Baek, Jun-Geol; 1 Berthold, Michael R.; 1 Berzal, Fernando; 1 Bouchachia, Abdelhamid; 1 Brain, Damien; 1 Cheng, Chunhung; 1 Cheung, David W.; 1 Chow, Chi-Yin; 1 Cubero, Juan-Carlos; 1 Fillbrunn, Alexander; 1 Freitas, Alex Alves; 1 Fu, Ada Waichee; 1 Ganti, Venkatesh; 1 Gehrke, Johannes E.; 1 Ghodsi, Mansi; 1 Hamilton, Howard J.; 1 Hassani, Hossein; 1 He, Yulin; 1 Hilderman, Robert J.; 1 Hoch, Robert; 1 Huang, Xu; 1 Jin, Wen; 1 Kim, Chang-Ouk; 1 Kim, Sung Shick; 1 King, Irwin; 1 Kwiatkowski, Piotr; 1 Lavington, Simon H.; 1 Lazarevic, Aleksandar; 1 Lee, Juhnyoung; 1 Loh, Wei-Yin; 1 Marín, Nicolás; 1 Nguyen, Sinh Hoa; 1 Obradovic, Zoran; 1 Osei-Bryson, Kweku-Muata; 1 Ou, Fang-Fang; 1 Pechenizkiy, Mykola; 1 Podlaseck, Mark; 1 Popova, E. A.; 1 Qian, Hailei; 1 Qian, Weining; 1 Ramakrishnan, Raghu; 1 Rastogi, Rajeev; 1 Sánchez, Daniel Eduardo; 1 Schonberg, Edith; 1 Shim, Kyuseok; 1 Siebes, Arno P. J. M.; 1 Silva, Emmanuel Sirimal; 1 Silva, Fernando; 1 Tang, Jian; 1 Varghese, P. Paul; 1 Wang, Jue; 1 Wang, Lian; 1 Wang, Ran; 1 Webb, Geoffrey I.; 1 Yiu, Siu-Ming; 1 Zaki, Mohammed Javeed; 1 Zhang, Jian; 1 Zhao, Kai; 1 Zhou, Aoying; 1 Zliobaite, Indre

Cited in 18 Serials
3 Data Mining and Knowledge Discovery; 2 Information Sciences; 2 Journal of Computer Science and Technology; 1 ACM Computing Surveys; 1 Computers & Mathematics with Applications; 1 Fuzzy Sets and Systems; 1 Journal of Computer and System Sciences; 1 Nonlinear Analysis. Theory, Methods & Applications. Series A: Theory and Methods; 1 Moscow University Computational Mathematics and Cybernetics; 1 International Journal of Production Research; 1 Computers & Operations Research; 1 Neural Networks; 1 Machine Learning; 1 Distributed and Parallel Databases; 1 Frontiers in Artificial Intelligence and Applications; 1 The Kluwer International Series in Engineering and Computer Science; 1 The Kluwer International Series on Advances in Database Systems; 1 Statistical Analysis and Data Mining

Cited in 5 Fields
24 Computer science (68-XX); 2 Statistics (62-XX); 2 Operations research, mathematical programming (90-XX); 1 Numerical analysis (65-XX); 1 Biology and other natural sciences (92-XX)