Learning to Improve Recommender Systems
Posted on: 2016-06-18 | Degree: Ph.D | Type: Thesis | University: The Chinese University of Hong Kong (Hong Kong) | Candidate: Ling, Guang | GTID: 2478390017981550 | Subject: Computer Science

Abstract/Summary:

With the rapid development of e-commerce websites, music and video streaming websites, and social sharing websites, users now face an explosion of choices. The unprecedentedly large number of choices leads to the information overload problem: the difficulty a user faces in understanding an issue and making decisions when too much information is present. Recommender systems learn users' preferences from their past behavior and make suggestions for them. These systems are a key component in alleviating and solving the information overload problem. Encouraging progress has been achieved in recommender systems research, from neighborhood-based methods to model-based methods. However, the recommender systems employed today are far from perfect. In this thesis, we propose to improve recommender systems from four perspectives motivated by real-life problems.

First and foremost, we develop online algorithms for collaborative filtering methods, which are widely used in recommender systems. Traditionally, batch-training algorithms are developed for collaborative filtering methods. They have the advantage of being easy to understand and simple to implement. However, batch-training algorithms fail to consider the dynamic scenario where new users and new items join the system constantly. To make recommendations for these new users and on these new items, batch-training algorithms need to re-train the model from scratch. During the training process of batch-training algorithms, all the data have to be processed in each iteration.
This is prohibitively slow given the sheer number of users and items faced by a real recommender system. Online learning algorithms can solve both problems by updating the model incrementally, one rating at a time.

Secondly, we question an assumption made implicitly by most recommender systems. Most existing recommender systems assume that the distribution of the collected ratings is the same as that of the unobserved ratings. Using data collected from a real-life recommender system, we show that this assumption is unlikely to be true. By employing missing data theory, we develop a model that drops this unrealistic assumption and makes unbiased predictions.

Thirdly, we examine the spam problem confronted by recommender systems. The ratings assigned by spam users contaminate the data of a recommender system and degrade the experience of normal users. We propose to use a reputation estimation system to keep track of users' reputations and to identify spam users based on their reputations. We develop a unified framework for reputation estimation that subsumes a number of existing reputation estimation methods. Based on the framework, we also develop a matrix-factorization-based method that demonstrates outstanding discrimination ability.

Lastly, we integrate content-based filtering with collaborative filtering to alleviate the cold-start problem. The cold-start problem refers to the situation where a system has too little information about a user or an item to make accurate recommendations. By using the rich, readily available information embedded in review comments, which is generally discarded, we can alleviate the cold-start problem.
Additionally, we can attach interpretable tags to the black-box collaborative filtering algorithm, which helps a recommender system explain why items are being recommended.

In summary, this thesis solves some of the major problems faced by recommender systems and improves them from several perspectives. Extensive experiments on large-scale real-life datasets confirm the effectiveness and efficiency of the proposed models.

Keywords/Search Tags: Recommender systems, Real life, Users, Collaborative filtering, Batch-training algorithms, Improve, Data, Develop