Research And Application Of Distributed Hybrid Recommendation Algorithm Based On Spark

Posted on: 2019-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: X W Li
GTID: 2428330545490143
Subject: Computer technology
Abstract/Summary:
With the spread of the Internet and the rapid development of information technology, the volume of data on the Internet is growing exponentially. Faced with massive, complex, and diverse data, users find it difficult to quickly extract the information they actually need. More and more online users now judge the strengths and weaknesses of goods from product reviews on e-commerce platforms and forums. Service providers, in turn, can better understand user needs by analyzing product review data, thereby improving user satisfaction. The focus of this thesis is how to recommend books of interest to users from massive book review data, based on the ratings and review text of the books.

This thesis uses billions of book reviews from Douban as a real-world data source and, under the Spark distributed computing framework, studies and implements a distributed hybrid recommendation algorithm for massive data. First, a Naive Bayes classification algorithm performs sentiment analysis on the defective records in the data set; after the Chinese text sentiment analysis, a score value is calculated and the repaired result is filled into the training data set. Second, the matrix-factorization-based ALS collaborative filtering algorithm is implemented in parallel under the Spark framework. On this basis, an algorithm based on the similarity of users' book preference features is studied and improved. This algorithm calculates the similarity between users, taking the diversity of the data set into account, and finds the users most similar to a given user. At recommendation time, the similar users' preference features can be integrated with the preliminary recommendation results by weighting, making the recommendations more accurate.

Finally, the ALS-based collaborative filtering recommendation algorithm is combined with the similarity algorithm based on users' book preference features. The ALS-based algorithm builds a recommendation matrix and generates a recommendation model from users' ratings of books. The preference-feature similarity algorithm finds the users and book preferences most similar to the current user, and the book preference result is integrated with the collaborative filtering result by weighting to obtain a more accurate recommendation. Experiments show that the Spark-based distributed hybrid recommendation algorithm designed and implemented in this thesis improves both the efficiency of recommendation model construction and the accuracy of recommendation, and also scales relatively well.
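The weighted integration described in the abstract can be illustrated with a minimal single-machine sketch in plain Python. It is not the thesis's implementation: the abstract does not give the similarity measure or the weighting formula, so cosine similarity over preference vectors, the similarity-weighted neighbor average, and the blending weight `alpha` are all assumptions, as are the function names. The collaborative filtering scores are taken as given inputs; in a Spark setting they would presumably come from an ALS model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two sparse preference vectors (dict: feature -> weight).

    Assumption: the abstract does not name the similarity measure; cosine is
    used here only for illustration.
    """
    dot = sum(w * b[f] for f, w in a.items() if f in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar_users(target, prefs, k):
    """Return the k users most similar to `target` as (user, similarity) pairs."""
    sims = [
        (other, cosine_similarity(prefs[target], vec))
        for other, vec in prefs.items()
        if other != target
    ]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:k]


def neighbor_scores(neighbors, ratings):
    """Similarity-weighted average rating per book over the similar users."""
    num, den = {}, {}
    for user, sim in neighbors:
        if sim <= 0:
            continue  # ignore dissimilar users
        for book, rating in ratings.get(user, {}).items():
            num[book] = num.get(book, 0.0) + sim * rating
            den[book] = den.get(book, 0.0) + sim
    return {book: num[book] / den[book] for book in num}


def hybrid_scores(cf_scores, nb_scores, alpha=0.7):
    """Blend collaborative filtering scores with neighbor-based scores.

    alpha is a hypothetical weight on the collaborative filtering side; a book
    scored by only one source keeps that source's score unchanged.
    """
    blended = {}
    for book in set(cf_scores) | set(nb_scores):
        if book in cf_scores and book in nb_scores:
            blended[book] = alpha * cf_scores[book] + (1 - alpha) * nb_scores[book]
        else:
            blended[book] = cf_scores.get(book, nb_scores.get(book))
    return blended
```

A caller would compute `top_k_similar_users` for the current user, pass the result to `neighbor_scores`, and blend that with the model's predicted scores via `hybrid_scores`. The value of `alpha` would need to be tuned on held-out ratings.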
Keywords: distributed recommendation system, hybrid recommendation, Spark, text sentiment analysis, similarity algorithm