swMATH ID: 7279
Software Authors: David D. Lewis; Yiming Yang; Tony G. Rose; Fan Li
Description: RCV1: A New Benchmark Collection for Text Categorization Research. Reuters Corpus Volume I (RCV1) is an archive of over 800,000 manually categorized newswire stories recently made available by Reuters, Ltd. for research purposes. Use of this data for research on text categorization requires a detailed understanding of the real-world constraints under which the data was produced. Drawing on interviews with Reuters personnel and access to Reuters documentation, we describe the coding policy and quality control procedures used in producing the RCV1 data, the intended semantics of the hierarchical category taxonomies, and the corrections necessary to remove erroneous data. We refer to the original data as RCV1-v1, and the corrected data as RCV1-v2. We benchmark several widely used supervised learning methods on RCV1-v2, illustrating the collection’s properties, suggesting new directions for research, and providing baseline results for future studies. We make available detailed, per-category experimental results, as well as corrected versions of the category assignments and taxonomy structures, via online appendices.
Homepage: http://dl.acm.org/citation.cfm?id=1005345
Related Software: LIBSVM; UCI-ml; BoosTexter; L-BFGS; SGD-QN; LIBLINEAR; AdaGrad; Pegasos; OHSUMED; HOGWILD; Adam; ImageNet; word2vec; ML-KNN; MULAN; t-SNE; SVMlight; Saga; ElemStatLearn; GloVe
Cited in: 104 Documents

Cited by 301 Authors

5 Lin, Chih-Jen
3 Bottou, Léon
3 Langford, John
3 Lin, Qihang
3 Yuan, Xiaotong
3 Zhang, Tong
2 Chang, Kai-Wei
2 Crammer, Koby
2 Drineas, Petros
2 Fürnkranz, Johannes
2 Hsieh, Cho-Jui
2 Huang, Yakui
2 Kuang, Da
2 Lebanon, Guy
2 Li, Lihong
2 Li, Ping
2 Liu, Hongwei
2 Nedić, Angelia
2 Park, Haesun
2 Schuster, Assaf
2 Shanbhag, Uday V.
2 Sharfman, Izchak
2 Song, Yangqiu
2 Xiao, Lin
2 Ye, Jieping
2 Yin, Wotao
2 Yousefian, Farzad
2 Yun, Sangwoon
1 Abe, Shigeo
1 Agarwal, Alekh
1 Arbabifard, Kamyar
1 Bach, Francis R.
1 Bahamonde, Antonio
1 Balakrishnan, Suhrid
1 Bashar, Md Abul
1 Basu, Sugato
1 Bayoudh, Ines
1 Bechet, Nicolas
1 Benites, Fernando
1 Berry, Michael W.
1 Bianchi, Pascal
1 Bontcheva, Kalina
1 Bordes, Antoine
1 Brinker, Klaus
1 Browne, Murray
1 Brucker, Florian
1 Buntine, Wray L.
1 Burkhardt, Sophie
1 Busygin, Stanislav
1 Cai, Hongmin
1 Cai, Linkun
1 Cen, Shicong
1 Cerri, Ricardo
1 Chambers, America
1 Chawla, Nitesh V.
1 Chen, Jianhui
1 Chen, Jiazhou
1 Cheng, Hong
1 Chow, Tommy W. S.
1 Cristianini, Nello
1 Cristofari, Andrea
1 Cunningham, Hamish
1 Curtis, Frank E.
1 Cuturi, Marco
1 Damerau, Fred J.
1 Daumé, Hal III
1 Davidson, Ian
1 De Santis, Marianna
1 De Tré, Guy
1 del Coz, Juan José
1 Deligiannakis, Antonios
1 Deng, Sucheng
1 Díez, Jorge
1 Diggavi, Suhas N.
1 Dillon, Joshua V.
1 Dimakis, Alexandros G.
1 Domeniconi, Carlotta
1 Drake, Barry L.
1 Dredze, Mark
1 Du, Lan
1 Du, Rundong
1 Duchi, John C.
1 Dudík, Miroslav
1 Duivesteijn, Wouter
1 Dvurechensky, Pavel E.
1 Elenberg, Ethan R.
1 Erhan, Dumitru
1 Fan, Rong-En
1 Fan, Yiwei
1 Fazel, Maryam
1 Fercoq, Olivier
1 Finley, Thomas
1 Flaounas, Ilias
1 Forman, George
1 Fountoulakis, Kimon
1 Gabrilovich, E.
1 Gallinari, Patrick
1 Galvan, Giulio
1 Gao, Hanning
1 Garofalakis, Minos
...and 201 more Authors

Citations by Year