
Research On Improved Fuzzy C-means Algorithm For Website Analysis

Posted on: 2018-09-12; Degree: Master; Type: Thesis
Country: China; Candidate: X B Wu
GTID: 2359330515489573; Subject: Management Science and Engineering
Abstract/Summary:
With the continuous development of cluster analysis, website analysis has attracted growing attention. Website log data is both rich and practical: it records users' access behavior, and clustering algorithms can mine it for latent patterns in that behavior, so that site staff can adjust and optimize the page structure in a timely manner. This matters for providing users with more comprehensive and personalized service. This thesis therefore focuses on efficient clustering algorithms and their application to web analytics.

The fuzzy C-means algorithm is one of the most widely used fuzzy clustering methods. Because it introduces the concept of membership degree, it is well suited to website analysis. After summarizing existing research on fuzzy C-means, this thesis examines its main shortcomings, namely the difficulty of determining the number of clusters and its sensitivity to densely distributed data, and proposes an improved fuzzy C-means algorithm. The main idea is to use the Canopy algorithm to generate an effective number of clusters and the initial cluster centers, which addresses both the difficulty of choosing the number of clusters and the local optima caused by randomly chosen initial centers. The distance measure is then changed from Euclidean distance to Mahalanobis distance in order to eliminate the influence of dense data distribution.

Website analysis here means website log analysis, and the improved fuzzy C-means algorithm is applied to a real case. First, the website log data is preprocessed through data cleaning, user identification, and session identification to obtain the pages each user accessed. Then the improved fuzzy C-means algorithm is applied to user clustering and page clustering; from the clustering results, groups of users with similar behavior are identified, along with the demand for and interest in individual pages. Finally, the traditional fuzzy C-means algorithm is also applied to page clustering. Comparing the two sets of page clustering results verifies the validity and correctness of the improved fuzzy C-means algorithm.
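The pipeline described in the abstract (Canopy seeding followed by fuzzy C-means under Mahalanobis distance) can be illustrated with a minimal pure-Python sketch. This is not the thesis's implementation, which the abstract does not give. The function names, the thresholds `t1` and `t2`, the use of Euclidean distance inside the Canopy pass, the single covariance matrix computed over the whole data set, and the toy data are all assumptions made for illustration.

```python
import math

def canopy_centers(points, t1, t2):
    """Canopy pass (assumes t1 > t2 >= 0): returns one center per canopy."""
    remaining, centers = list(points), []
    while remaining:
        seed = remaining[0]
        # Points within the loose threshold t1 belong to this canopy.
        members = [p for p in remaining if math.dist(p, seed) <= t1]
        centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        # Points within the tight threshold t2 cannot seed another canopy.
        remaining = [p for p in remaining if math.dist(p, seed) > t2]
    return centers

def inverse_covariance(points):
    """Sample covariance of the data, inverted by Gauss-Jordan elimination."""
    n, d = len(points), len(points[0])
    mean = [sum(col) / n for col in zip(*points)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
            for j in range(d)] for i in range(d)]
    aug = [row + [float(i == j) for j in range(d)] for i, row in enumerate(cov)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(d):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]

def mahalanobis(x, y, inv_cov):
    diff = [a - b for a, b in zip(x, y)]
    q = sum(diff[i] * inv_cov[i][j] * diff[j]
            for i in range(len(diff)) for j in range(len(diff)))
    return math.sqrt(max(q, 0.0))

def fuzzy_c_means(points, centers, inv_cov, m=2.0, max_iter=100, tol=1e-6):
    """Standard FCM updates with a fixed Mahalanobis metric and given seeds."""
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    for _ in range(max_iter):
        memberships = []
        for p in points:
            dists = [mahalanobis(p, c, inv_cov) for c in centers]
            if min(dists) == 0.0:
                # Point coincides with a center: give it full membership there.
                row = [1.0 if dk == 0.0 else 0.0 for dk in dists]
            else:
                row = [dk ** (-2.0 / (m - 1.0)) for dk in dists]
            total = sum(row)
            memberships.append([v / total for v in row])
        new_centers = []
        for k in range(len(centers)):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            new_centers.append(tuple(
                sum(w * p[j] for w, p in zip(weights, points)) / total
                for j in range(dim)))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # Memberships are those computed in the last membership step.
    return centers, memberships

# Toy data (hypothetical): two groups of 2-D feature vectors.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
seeds = canopy_centers(points, t1=2.0, t2=1.0)
centers, memberships = fuzzy_c_means(points, seeds, inverse_covariance(points))
```

In this sketch the number of clusters is simply the number of canopies found, so it never has to be supplied by hand, and the fuzzy C-means loop starts from the canopy centers rather than from random ones. In a log-analysis setting each point would be a per-user or per-page feature vector built from the preprocessed sessions; how the thesis constructs those vectors is not stated in the abstract.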
Keywords/Search Tags: Website analysis, Fuzzy C-means, User clustering, Page clustering