Web mining for pattern discovery in e-commerce applications

Posted on: 2002-01-04
Degree: M.S
Type: Dissertation
University: University of Louisville
Candidate: Chatterjee, Debasish
Full Text: PDF
GTID: 1468390014450079
Subject: Business Administration
Abstract/Summary:
Web site design is currently based on thorough investigations of the interests of web site visitors. User navigational paths are modeled as sequences of web page views that define the user's preferences. Concrete knowledge of how visitors navigate the web site could prevent disorientation and help owners place important information exactly where visitors look for it. Conventional sequence miners and navigational path analyses discover only frequent sequences, which limits the applicability and scope of sequence mining for Web usage analysis. This dissertation presents an algorithm and a tool, iLINT (Internet Log Interpreter), that extract the paths users have taken while visiting a web site. iLINT extracts sequences with high support as well as sequences with low support. The general problem addressed is: given a log file consisting of user traversal entries, how to extract both the paths that were taken most frequently and the paths that were taken less frequently. The iLINT algorithm and tool were developed to extract such knowledge from Web server logs. To discover interesting paths efficiently, a technique based on processing navigational patterns containing sequences of web page views was used. A pattern is a sequential list of page views ordered by transaction time, and each transaction consists of a single pattern. Finally, experimental results are given, along with a comparison of iLINT with a few commercial tools and with a well-known sequence-mining algorithm, GSP. It is shown that most commercial Web usage mining tools do not perform a proper path analysis and can be improved by incorporating algorithms like iLINT. The iLINT algorithm was found to scale up linearly with the number of data sequences and to have very good scale-up properties with respect to the average data-sequence size.
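The problem statement in the abstract can be illustrated with a minimal sketch. This is not the iLINT algorithm, whose details the abstract does not give; it is a naive counting approach under stated assumptions. The log records, the function names (`build_patterns`, `path_support`, `split_by_support`), the restriction to contiguous paths of a fixed length, and the support threshold are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical log records (user_id, timestamp, page); not data from the dissertation.
LOG = [
    ("u1", 1, "/home"), ("u1", 2, "/products"), ("u1", 3, "/cart"),
    ("u2", 1, "/home"), ("u2", 2, "/products"), ("u2", 3, "/cart"),
    ("u3", 1, "/home"), ("u3", 2, "/about"),
    ("u4", 2, "/products"), ("u4", 1, "/home"), ("u4", 3, "/cart"),
]

def build_patterns(log):
    """Group page views per user and order them by time (one pattern per transaction)."""
    by_user = defaultdict(list)
    for user, timestamp, page in log:
        by_user[user].append((timestamp, page))
    return {user: tuple(page for _, page in sorted(views))
            for user, views in by_user.items()}

def path_support(patterns, length):
    """Return the fraction of patterns that contain each contiguous path of `length` pages."""
    counts = Counter()
    for pages in patterns.values():
        # A set ensures each path is counted at most once per pattern.
        seen = {pages[i:i + length] for i in range(len(pages) - length + 1)}
        counts.update(seen)
    total = len(patterns)
    return {path: count / total for path, count in counts.items()}

def split_by_support(support, threshold):
    """Separate paths into high-support and low-support groups (threshold is an assumption)."""
    high = {p: s for p, s in support.items() if s >= threshold}
    low = {p: s for p, s in support.items() if s < threshold}
    return high, low

patterns = build_patterns(LOG)
support = path_support(patterns, length=2)
high, low = split_by_support(support, threshold=0.5)
```

Here the low-support group is kept rather than discarded, which mirrors the abstract's point that infrequent paths can also be of interest. A real tool would additionally need to parse server log lines and identify sessions, which this sketch leaves out.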
Keywords/Search Tags: Web, Pattern, Paths, Sequences, Mining