Research On Improvement Of Recommendation Algorithm Based On Spark

Posted on:2020-08-22

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Liu

Full Text:PDF

GTID:2428330590963515

Subject:Engineering

Abstract/Summary:

The increasing integration of diverse fields with the Internet has brought dramatic growth in interrelated data. It is essential to recommend personalized data, out of this huge volume, to the customers who are interested in it. Although collaborative filtering recommendation algorithms have been widely applied in various fields, the bottleneck of single-machine iteration capability means that, with huge amounts of data, the problems of data sparseness and scalability become more prominent, which seriously affects the accuracy of the Slope One-Bi recommendation algorithm. The Spark platform can greatly improve recommendation efficiency by using its in-memory advantage to iterate the recommendation algorithm. With these objectives, this thesis aims to improve the recommendation algorithm and implement it in parallel on the Spark platform. The problems of low recommendation accuracy, slow iteration speed, and high computational complexity of the original Slope One-Bi algorithm are analyzed on the basis of the Spark platform and related big data technology. The following studies were conducted:

1. A parallel Canopy-k-medoids clustering algorithm is proposed on the big data platform. The Canopy algorithm is first used to traverse the data set to obtain the number of clusters and the global center points. The k-medoids algorithm then calculates the distance from each point to each center point for partitioning, which effectively improves the clustering effect. Finally, UCI data sets are used to test performance: the speedup and scaleup ratios are improved to a certain extent, and the clustering effect is the best when compared with three other clustering algorithms.

2. The clustering algorithm combining Canopy and k-medoids groups users with a high degree of similarity together. The nearest neighbors are then searched dynamically within each cluster according to whether the similarity between users is sufficiently high, and the Slope One-Bi algorithm is used for recommendation and prediction. Finally, the method is parallelized on the Spark big data platform.

In conclusion, the work addresses the deficiencies of the Slope One algorithm and the bottleneck of stand-alone iteration: dynamic k-nearest neighbor and Canopy-k-medoids clustering are added to improve recommendation performance and reduce the MAE value.
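The two-stage clustering described in study 1 can be illustrated with a minimal single-machine sketch in Python. It is not the thesis implementation: the function names, the Euclidean distance, the single Canopy threshold `t2` (the usual Canopy formulation uses two thresholds), and the toy data are all assumptions made for illustration, and the Spark parallelization is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopy_centers(points, t2):
    """Simplified Canopy pass (assumption: one threshold only).

    Takes the first remaining point as a center, then discards every
    remaining point within t2 of it. The number of centers returned is
    used as k, and the centers seed k-medoids.
    """
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)
        centers.append(center)
        remaining = [p for p in remaining if euclidean(p, center) > t2]
    return centers


def assign(points, medoids):
    """Partition points by their nearest medoid."""
    clusters = [[] for _ in medoids]
    for p in points:
        idx = min(range(len(medoids)), key=lambda i: euclidean(p, medoids[i]))
        clusters[idx].append(p)
    return clusters


def k_medoids(points, initial_medoids, max_iter=20):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(initial_medoids)
    for _ in range(max_iter):
        clusters = assign(points, medoids)
        new_medoids = []
        for old, members in zip(medoids, clusters):
            if not members:
                new_medoids.append(old)  # keep the old medoid for an empty cluster
                continue
            # The medoid is the member with the smallest total distance to the others.
            best = min(members, key=lambda c: sum(euclidean(c, m) for m in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups of 2-D points.
    data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
            (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
    seeds = canopy_centers(data, t2=3.0)
    medoids, clusters = k_medoids(data, seeds)
    print("k =", len(seeds))
    print("medoids =", medoids)
    print("cluster sizes =", [len(c) for c in clusters])
```

Seeding k-medoids from the Canopy centers removes the need to choose k by hand and avoids random initialization, which is the motivation the abstract gives for combining the two algorithms.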

Keywords/Search Tags:

Slope One-Bi, k-medoids, Spark, parallelization

Related items

1	Research On K-medoids Clustering Algorithm Based On Spark
2	Research And Implementation Of Parallel Recommandation Algorithm Based On Spark
3	Research On Dynamic Recommendation Parallelization Algorithm Based On Clustering
4	Research And Implementation Of Classification Algorithm Parallelization Based On Spark
5	Improvement And Implementation Of Slope One Collaborative Recommendation Algorithm Based On Spark
6	The Design And Implementation Of Parallelization Of Canopy And FCM Clustering Algorithms On Spark Platform
7	The Parallelization And Optimization Of K-means Algorithm Based On Spark
8	Video Analysis Technology Based On SPARK
9	Research And Application Of Parallelization Optimization Of Spatial Clustering Algorithm Based On Spark
10	Research On Optimization Of Association Rule Apriori Algorithm And Its Parallelization Based On Spark