| The development of science and technology has brought an explosion of data. As a result, the sheer volume of data makes it difficult for users to find the information they are interested in. Recommendation systems, which rely on users' information, have become an effective way to filter out irrelevant items. However, because a traditional recommendation system is based on historical data from a fixed period (usually one day), it cannot ensure real-time validity. Moreover, when dealing with the cold-start problem, a traditional recommendation system has low accuracy and delivers a poor user experience. To address these problems, this thesis introduces rough set models on two universes into the recommendation system to improve accuracy under cold start. In addition, in pursuit of real-time validity, we build a real-time recommendation system on a distributed dataflow framework. More specifically, first, in the rough set model, we use both the user dataset and the product dataset to learn preferences. Normally, each user's rating of a product defines a one-to-one mapping between that user and that product. Based on the rating, we classify the user's attitude towards the product as “like” or “dislike” using a trained baseline. In other words, a rating above the baseline is labeled a “positive mapping”, and a rating below it a “negative mapping”. We then discard the negative mappings when extracting the preference rules, which improves accuracy under cold start. Second, we build a real-time recommendation system based on the Flink framework. To solve the eigendecomposition of the sparse matrix, we further propose a new gradient descent algorithm that adapts its weights in a distributed manner and performs well in terms of convergence rate. Third, from the perspective of software engineering, we give a full description of the new algorithm, covering system requirements, system design, and modular design. Finally, we test the functionality and performance of the new algorithm and demonstrate the reliability and robustness of our work. |