| With the spread of the Internet and the development of Internet technology, a wide variety of Internet applications have gradually entered daily life and are quietly changing the way we live. Information on the Internet is vast and complex, and online shoppers are confronted with an overwhelming range of goods. Helping users find the goods they actually need has become an important research problem, and recommendation systems emerged in this context. A recommendation system recommends products to users based on their personal information and behavior. Among the many recommendation algorithms, collaborative filtering is the most widely used and most successful in industry. With the explosive growth of Internet data, traditional technology faces the challenge of big data, and traditional recommendation systems are affected as well: computation becomes more time-consuming, and the cold start and sparse matrix problems become harder to solve. Apache Hadoop, a top-level Apache project, is an open source distributed computing platform that is currently popular and mature. MapReduce is a programming model for distributed computing proposed by Google; it splits a task into multiple sub-tasks and runs them on multiple server nodes to improve computing performance and data processing capacity. Running MapReduce jobs on the Hadoop platform is a common way to process big data, and it also offers an approach to big data problems in the research and practice of recommendation systems. The work in this paper consists of two parts. The first part is the study and improvement of the recommendation algorithm. First, common recommendation algorithms were analyzed, and the key issues in recommendation were introduced, such as experimental and evaluation methods, the cold start problem, and hybrid recommendation techniques. 
Next, the work focused on the item-based collaborative filtering algorithm: to address the shortcomings of the traditional cosine similarity algorithm, an improved cosine similarity algorithm based on the rating baseline value and the rating date was proposed. Then, MapReduce was applied to the traditional item-based collaborative filtering recommendation algorithm to implement a distributed version. Finally, experiments were conducted. A comparison of average error showed that the improved cosine similarity algorithm improves prediction accuracy. Running the algorithm on Hadoop and comparing running time and other metrics showed that the distributed item-based collaborative filtering algorithm performs better than the traditional one on large amounts of data. The second part of this work is the design and implementation of a movie recommendation system. The overall system architecture includes a user model, a film model, a recommendation engine, an offline computing module, a data synchronization module, a data conversion module, a cache module, and a user behavior collection and feedback module, among others. HTML5 was used on the front end, and responsive design allows the system to adapt to most devices, including PCs, tablets, and mobile phones. Finally, functional tests were conducted on the implemented system, and each functional module operated normally. |
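
The item-based collaborative filtering idea summarized above can be illustrated with a minimal, single-machine Python sketch. It is not the improved algorithm proposed in this work: the abstract does not give its formula, so the sketch assumes that the rating baseline is simply each user's mean rating (the common adjusted cosine form) and it ignores the rating date. The function names and the toy ratings are hypothetical.

```python
from math import sqrt

# Toy ratings: user -> {item: rating}. Hypothetical data for illustration only.
RATINGS = {
    "u1": {"A": 5.0, "B": 3.0, "C": 4.0},
    "u2": {"A": 4.0, "B": 2.0, "C": 5.0},
    "u3": {"A": 1.0, "B": 5.0},
}


def user_means(ratings):
    """Mean rating per user, used here as an assumed rating baseline."""
    return {user: sum(r.values()) / len(r) for user, r in ratings.items()}


def adjusted_cosine(ratings, item_i, item_j):
    """Cosine similarity of two items over the users who rated both,
    after subtracting each user's mean rating from their ratings."""
    means = user_means(ratings)
    dot = norm_i = norm_j = 0.0
    for user, rated in ratings.items():
        if item_i in rated and item_j in rated:
            di = rated[item_i] - means[user]
            dj = rated[item_j] - means[user]
            dot += di * dj
            norm_i += di * di
            norm_j += dj * dj
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # no co-rating users, or no variation to compare
    return dot / (sqrt(norm_i) * sqrt(norm_j))


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted average of the user's
    ratings on other items, using only positively similar items."""
    num = den = 0.0
    for other, rating in ratings[user].items():
        if other == item:
            continue
        sim = adjusted_cosine(ratings, item, other)
        if sim > 0.0:
            num += sim * rating
            den += sim
    return num / den if den else None  # None when no similar item exists
```

Calling `predict(RATINGS, "u3", "C")` estimates a rating for an item that user `u3` has not rated. In a MapReduce setting, the pairwise sums accumulated inside `adjusted_cosine` are the kind of per-item-pair aggregation that could be distributed across nodes, although the job design used in this work is not described here.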