zbMATH — the first resource for mathematics

Knowledge derived from wikipedia for computing semantic relatedness. (English) Zbl 1182.68291
Summary: Wikipedia provides a semantic network for computing semantic relatedness in a more structured fashion than a search engine and with more coverage than WordNet. We present experiments on using Wikipedia for computing semantic relatedness and compare it to WordNet on various benchmarking datasets. Existing relatedness measures perform better using Wikipedia than a baseline given by Google counts, and we show that Wikipedia outperforms WordNet on some datasets. We also address the question whether and how Wikipedia can be integrated into NLP applications as a knowledge base. Including Wikipedia improves the performance of a machine learning based coreference resolution system, indicating that it represents a valuable resource for NLP applications. Finally, we show that our method can be easily used for languages other than English by computing semantic relatedness for a German dataset.

68T35 Theory of languages and software systems (knowledge-based systems, expert systems, etc.) for artificial intelligence
68T05 Learning and adaptive systems in artificial intelligence
68M10 Network design and communication in computer systems
LIBSVM; SVMTool; WordNet
Full Text: Link