swMATH ID: 11760
Software Authors: Shafer CJ, Agrawal R, Mehta M
Description: SPRINT: a scalable parallel classifier for data mining. Classification is an important data mining problem. Although classification is a well-studied problem, most of the current classification algorithms require that all or a portion of the entire dataset remain permanently in memory. This limits their suitability for mining over large databases. We present a new decision-tree-based classification algorithm, called SPRINT, that removes all of the memory restrictions, and is fast and scalable. The algorithm has also been designed to be easily parallelized, allowing many processors to work together to build a single consistent model. This parallelization, also presented here, exhibits excellent scalability as well. The combination of these characteristics makes the proposed algorithm an ideal tool for data mining.
Homepage: http://citeseerx.ist.psu.edu/viewdoc/download?doi=
Related Software: SLIQ; C4.5; UCI-ml; RainForest; AutoClass; Fuzzy ARTMAP; PLANET; WEKA; gSpan; LOGML; Genocop; MLC++; Apache Spark; MLlib; MapReduce; SPSS; StreamKrimp; MOA; CudaRF; CUDA
Cited in: 27 Documents
