LOGML: Log markup language for web usage mining. (English) Zbl 1048.68889
Kohavi, Ron (ed.) et al., WEBKDD 2001 - mining web log data across all customers touch points. 3rd international workshop, San Francisco, CA, USA, August 26, 2001. Revised papers. Berlin: Springer (ISBN 3-540-43969-2). Lect. Notes Comput. Sci. 2356, 88-112 (2002).
Summary: Web Usage Mining refers to the discovery of interesting information from user navigational behavior as stored in web access logs. While extracting simple information from web logs is easy, mining complex structural information is very challenging. Data cleaning and preparation constitute a very significant effort before mining can even be applied. We propose two new XML applications, XGMML and LOGML, to help us in this task. XGMML is a graph description language and LOGML is a web-log report description language. We generate a web graph in XGMML format for a web site using the web robot of the WWWPal system. We generate web-log reports in LOGML format for a web site from web log files and the web graph. We further illustrate the usefulness of LOGML in web usage mining; we show the simplicity with which mining algorithms (for extracting increasingly complex frequent patterns) can be specified and implemented efficiently using LOGML.
68U99 Computing methodologies and applications
68U35 Computing methodologies for information systems (hypertext navigation, interfaces, decision support, etc.)
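
Illustrative sketch (not taken from the paper): the summary states that frequent-pattern mining becomes simple once user sessions are available in an XML report. The Python fragment below shows the general idea on a hypothetical, heavily simplified LOGML-like document. The element and attribute names (userSessions, userSession, path, uedge, source, target), the sample data, and the helper functions are assumptions made for illustration and may not match the actual LOGML schema. It counts the pages and links that occur in at least a minimum number of sessions, which is the simplest kind of frequent pattern.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, simplified LOGML-like report. Element and attribute
# names are assumptions for illustration, not the published schema.
SAMPLE = """
<logml>
  <userSessions>
    <userSession name="a">
      <path>
        <uedge source="1" target="2"/>
        <uedge source="2" target="3"/>
      </path>
    </userSession>
    <userSession name="b">
      <path>
        <uedge source="1" target="2"/>
        <uedge source="2" target="4"/>
      </path>
    </userSession>
    <userSession name="c">
      <path>
        <uedge source="5" target="2"/>
        <uedge source="2" target="3"/>
      </path>
    </userSession>
  </userSessions>
</logml>
"""


def read_sessions(xml_text):
    """Return one list of (source, target) page-id pairs per user session."""
    root = ET.fromstring(xml_text)
    sessions = []
    for session in root.iter("userSession"):
        edges = [(e.get("source"), e.get("target")) for e in session.iter("uedge")]
        sessions.append(edges)
    return sessions


def pages(edges):
    """Set of page ids visited in one session."""
    return {page for edge in edges for page in edge}


def links(edges):
    """Set of distinct links traversed in one session."""
    return set(edges)


def frequent_items(sessions, min_support, extract):
    """Keep items that occur in at least min_support sessions.

    extract returns a set, so each session contributes at most one
    count per item.
    """
    counts = Counter()
    for edges in sessions:
        counts.update(extract(edges))
    return {item: n for item, n in counts.items() if n >= min_support}


sessions = read_sessions(SAMPLE)
frequent_pages = frequent_items(sessions, 2, pages)
frequent_links = frequent_items(sessions, 2, links)
print(frequent_pages)
print(frequent_links)
```

Longer patterns (sequences or subtrees of pages) would extend the same scheme by generating candidates from the frequent links; that step is omitted here.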
