Analysis And Research Of E-commerce's User Data

Posted on: 2018-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: G L Xie
Full Text: PDF
GTID: 2348330518493429
Subject: Information and Communication Engineering
Abstract/Summary:
The e-commerce industry widely applies recommender system technology to meet users' individual needs. To recommend better, two key problems must be addressed. The first is extracting effective and distinctive features. Typical machine learning pipelines generate high-dimensional features from massive raw user logs. However, excessively high-dimensional features tend to bring on the curse of dimensionality, for example longer model training time or overfitting. It is therefore important to find a better way to remove redundant and indistinguishable features and to ensure that each remaining feature is effective. The second is avoiding the cold-start phenomenon in the recommender system. In real e-commerce scenarios, many users have very few logged records, so recommending for them runs into a severe cold-start problem, which results in low prediction accuracy. Such users should therefore be handled separately in advance. In addition, the factorization machine (FM) algorithm treats all features equally, and so fails to emphasize crucial prior knowledge.

Against this background, this thesis proposes two methods, from both engineering and theoretical perspectives, to address these problems. First, it proposes a feature validation method based on the message recall rate across user types classified by a specific feature; this method is used to extract useful features more effectively. Second, it proposes a cascaded recommender system that combines a clustering algorithm with a factorization machine and blends the recommendation score with a base score derived from each new user's behavior. The main contents of the study are as follows.

First, after extracting many features from real e-commerce data sets, this thesis proposes a method to validate the features' classification performance, and studies users' behavior and products' attributes using statistical analysis. The method uses a specific feature to partition users into orthogonal groups, pushes SMS messages to the different groups, and finally evaluates the feature's effectiveness by the actual message recall rate. It reveals the impact on subsequent purchase or churn behavior along every dimension. Experimental results show that this practical method does extract the main effective features.

Second, to lower the RMSE of the score prediction problem, this thesis proposes a cascaded recommender system consisting of an offline subsystem and a real-time subsystem. The offline subsystem builds users' multidimensional features, mines users' different behavior patterns with a clustering algorithm, and produces a base score for each new user from the features collected at registration. The real-time subsystem then predicts scores separately for each user pattern, using the factorization machine method to compute the correlation between users and products, and mixes the predicted scores with the base scores to give the final scores. Experimental results show that the method improves recommendation accuracy and reduces recommendation time.
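The scoring step of the real-time subsystem can be illustrated with a minimal sketch. It assumes the standard second-order factorization machine formula and a simple linear blend of the FM score with the base score. The abstract does not say how the two scores are mixed, so `mix_scores` and its weight `alpha` are hypothetical, as are all function and parameter names below.

```python
from typing import Sequence


def fm_predict(x: Sequence[float], w0: float, w: Sequence[float],
               v: Sequence[Sequence[float]]) -> float:
    """Second-order factorization machine score for one feature vector x.

    w0 is the global bias, w[i] the linear weight of feature i, and
    v[i] the latent factor vector of feature i (all assumed already trained).
    """
    n = len(x)
    k = len(v[0])
    # Linear part: w0 + sum_i w_i * x_i
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    # Pairwise part, computed per latent dimension f as
    # 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2),
    # which equals sum_{i<j} <v_i, v_j> x_i x_j.
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def mix_scores(fm_score: float, base_score: float, alpha: float = 0.7) -> float:
    """Hypothetical linear blend; alpha is an assumed weight, not from the thesis."""
    return alpha * fm_score + (1.0 - alpha) * base_score


if __name__ == "__main__":
    # Toy, made-up parameters purely for illustration.
    x = [1.0, 0.0, 1.0]
    score = fm_predict(x, w0=0.5, w=[0.1, 0.2, 0.3],
                       v=[[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]])
    print(mix_scores(score, base_score=3.0, alpha=0.5))
```

In a cascaded setup like the one described, one such set of FM parameters would presumably be kept per user cluster, with the cluster chosen offline; that routing is omitted here.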
Keywords/Search Tags: Big data, Cascading connection, K-means++, Factorization machine, Mix