zbMATH — the first resource for mathematics

BabelNet: the automatic construction, evaluation and application of a wide-coverage multilingual semantic network. (English) Zbl 1270.68299
Summary: We present an automatic approach to the construction of BabelNet, a very large, wide-coverage multilingual semantic network. Key to our approach is the integration of lexicographic and encyclopedic knowledge from WordNet and Wikipedia. In addition, machine translation is applied to enrich the resource with lexical information for all languages. We first conduct in vitro experiments on new and existing gold-standard datasets to show the high quality and coverage of BabelNet. We then show that our lexical resource can be used successfully to perform both monolingual and cross-lingual word sense disambiguation: thanks to its wide lexical coverage and novel semantic relations, we are able to achieve state-of-the-art results on three different SemEval evaluation tasks.

MSC:
68T30 Knowledge representation
68T50 Natural language processing