| With the rapid development of network technology and e-commerce, the amount of information on the network has grown enormously. Users can not only obtain information but also publish it, so online information resources keep increasing, which leads to the "information overload" problem. Users enjoy the convenience of e-commerce but are often overwhelmed by the volume of product information. Personalized recommendation technology is an effective way to address information overload. Collaborative filtering is one of the most widely used personalized recommendation techniques; although it has achieved great success, it also faces serious challenges, the most difficult of which is data sparsity. Data sparsity is an inevitable result of the growing number of users and items. Because collaborative filtering relies on users' historical ratings, data sparsity is an important factor limiting its accuracy. This paper focuses on the data sparsity problem in collaborative filtering algorithms. The main contributions are as follows: (1) After analyzing the shortcomings of the traditional collaborative filtering algorithm and building on previous research, a collaborative filtering algorithm based on rough sets and attribute importance is proposed.
First, the items rated by users are filtered; then, combined with the Universes Nearest Neighbor, the incomplete-data filling algorithm ROUSTIDA is applied, on the basis of an analysis of attribute importance, to fill the original user-item rating matrix, which reduces the inaccuracy of similarity computation under data sparsity. The experimental results show that the algorithm can improve recommendation quality. (2) Because traditional collaborative filtering is easily affected by data sparsity, this paper incorporates users' preferences for item attributes and proposes a hybrid collaborative filtering recommendation algorithm based on user preferences. First, users' preferences for item attributes are analyzed; then, when computing the similarity of two users, the improved cosine similarity and the user preference similarity are combined, with a weighting coefficient "γ" balancing the importance of the two parts. Experiments show that this algorithm effectively addresses the data sparsity problem and achieves better accuracy than traditional collaborative filtering algorithms when sparsity is severe. |
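
The weighted combination described in the second contribution can be illustrated with a minimal sketch. The abstract does not give the formulas, so several details here are assumptions rather than the paper's method: the "improved cosine similarity" is taken to be an adjusted (mean-centred) cosine over co-rated items, a user's preference for an attribute is modelled as the mean rating the user gave to items carrying that attribute, γ is assumed to weight the rating part and 1 − γ the preference part, and all function names are hypothetical.

```python
import numpy as np

def adjusted_cosine(r_u, r_v):
    """Cosine of mean-centred ratings over co-rated items (0 means unrated)."""
    both = (r_u > 0) & (r_v > 0)
    if not both.any():
        return 0.0  # no co-rated items: rating similarity is undefined, use 0
    du = r_u[both] - r_u[r_u > 0].mean()
    dv = r_v[both] - r_v[r_v > 0].mean()
    denom = np.linalg.norm(du) * np.linalg.norm(dv)
    return float(du @ dv / denom) if denom > 0 else 0.0

def preference_vector(r_u, item_attrs):
    """Assumed preference model: mean rating per item attribute."""
    rated = r_u > 0
    counts = item_attrs[rated].sum(axis=0)      # rated items per attribute
    totals = r_u[rated] @ item_attrs[rated]     # rating mass per attribute
    return np.divide(totals, counts,
                     out=np.zeros_like(totals, dtype=float),
                     where=counts > 0)

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def hybrid_similarity(r_u, r_v, item_attrs, gamma=0.5):
    """gamma * rating similarity + (1 - gamma) * preference similarity."""
    s_rating = adjusted_cosine(r_u, r_v)
    s_pref = cosine(preference_vector(r_u, item_attrs),
                    preference_vector(r_v, item_attrs))
    return gamma * s_rating + (1 - gamma) * s_pref

# Two users with no co-rated items: the rating part is 0,
# but the preference part still yields a usable similarity.
attrs = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])  # items x attributes
u = np.array([5.0, 3.0, 0.0, 0.0])
v = np.array([0.0, 0.0, 4.0, 2.0])
print(hybrid_similarity(u, v, attrs, gamma=0.5))
```

The example users share no rated items, which is the sparse case the abstract targets: a purely rating-based similarity carries no signal there, while the attribute-preference term still distinguishes users.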