Learning domain ontologies from document warehouses and dedicated web sites. (English) Zbl 1234.68373
Summary: We present a method and a tool, OntoLearn, aimed at the extraction of domain ontologies from Web sites, and more generally from documents shared among the members of virtual organizations. OntoLearn first extracts a domain terminology from available documents. Then, complex domain terms are semantically interpreted and arranged in a hierarchical fashion. Finally, a general-purpose ontology, WordNet, is trimmed and enriched with the detected domain concepts. The major novel aspect of this approach is semantic interpretation, that is, the association of a complex concept with a complex term. This involves finding the appropriate WordNet concept for each word of a terminological string and the appropriate conceptual relations that hold among the concept components. Semantic interpretation is based on a new word sense disambiguation algorithm, called structural semantic interconnections.

68T30 Knowledge representation
68M11 Internet topics
68T50 Natural language processing