
User Behavior And Interest Analysis Based On Click Recognition

Posted on: 2019-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Lin
Full Text: PDF
GTID: 2348330545484504
Subject: Information and Communication Engineering
Abstract/Summary:
The rapid development of the Internet has changed every aspect of people's lives. Data on the Internet records users' behavior, including their hobbies, habits, and preferences. How to mine this data in depth and make full use of it has become a hot research topic. In this thesis, we first propose a real-time user click recognition method based on big data technology. Specifically, we construct the Referer graph of HTTP requests and propose a set of rules that operate on this graph. We then use Spark Streaming to identify user click requests. To further understand the internal structure of the network characteristics, we carry out a statistical analysis of the recognition results and construct a bipartite graph from them. Finally, to better understand user behavior, we apply a community discovery algorithm to the click recognition results, using our definition of an affinity measurement.

The main contributions of this thesis are the following three points. First, the Referer-based click recognition method is simple and can handle the fact that different users have different click behaviors. However, further analysis of the click recognition results showed that its accuracy was not satisfactory. We therefore propose a novel redirection recognition rule in addition to the other filtering rules, which effectively improves the accuracy of the recognition method. Second, despite their higher accuracy, previous user click recognition algorithms are difficult to apply in large-scale real-time environments. This thesis uses Spark Streaming to achieve near-real-time processing of massive data sets. Third, most previous community discovery algorithms target a single large-scale website. This thesis uses data from the total outbound traffic of a college, so we can analyze a variety of websites. We then construct an affinity graph using an affinity measurement based on user similarity and discover the most influential nodes. These results let us understand how users behave from different perspectives.
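
To make the Referer-graph idea concrete, here is a minimal Python sketch of how click requests might be separated from automatically triggered requests. It is not the thesis's implementation: the abstract does not spell out its rules, so the three filters below (HTML content only, skip 3xx redirect hops, require at least one child in the Referer graph) are illustrative assumptions. The `Request` record, its field names, and the example URLs are hypothetical. The sketch runs on an in-memory list, not on Spark Streaming.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Request(NamedTuple):
    """One HTTP request/response record (hypothetical schema)."""
    user: str
    url: str
    referer: Optional[str]
    content_type: str
    status: int


def build_referer_graph(requests):
    """Per-user Referer graph: (user, referer URL) -> URLs requested from it."""
    graph = defaultdict(list)
    for r in requests:
        if r.referer:
            graph[(r.user, r.referer)].append(r.url)
    return graph


def identify_clicks(requests):
    """Return (user, url) pairs judged to be user clicks.

    Assumed rules (illustrative, not taken from the thesis):
      1. skip redirect hops and errors: only status 200 counts;
      2. only HTML documents can be pages a user clicked to;
      3. the page must appear as the Referer of at least one other request,
         i.e. it has children in the Referer graph.
    """
    graph = build_referer_graph(requests)
    clicks = []
    for r in requests:
        if r.status != 200:
            continue  # rule 1: a 3xx hop is not itself a page view
        if not r.content_type.startswith("text/html"):
            continue  # rule 2: images, CSS, scripts are embedded objects
        if not graph.get((r.user, r.url)):
            continue  # rule 3: no children, likely an embedded frame
        clicks.append((r.user, r.url))
    return clicks


sample = [
    Request("u1", "http://a.example/", None, "text/html", 200),
    Request("u1", "http://a.example/style.css", "http://a.example/", "text/css", 200),
    Request("u1", "http://a.example/go", "http://a.example/", "text/html", 302),
    Request("u1", "http://b.example/", "http://a.example/", "text/html", 200),
    Request("u1", "http://b.example/logo.png", "http://b.example/", "image/png", 200),
    Request("u1", "http://ads.example/frame.html", "http://b.example/", "text/html", 200),
]

for user, url in identify_clicks(sample):
    print(user, url)
```

In this sample, the two pages that load embedded objects are kept, while the stylesheet, the image, the 302 hop, and the childless frame are filtered out. In a streaming setting the same per-record rules would be applied within each micro-batch or window, which is where a framework such as Spark Streaming would come in.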
Keywords/Search Tags: internet traffic, user click identification, Spark Streaming, community discovery