
Research On Debiasing Methods For Data Bias In Recommendation Systems

Posted on: 2024-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Qiu
Full Text: PDF
GTID: 2568306932455824
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of the Internet, the amount of online information has exploded, and users face severe information overload. How to accurately predict user interests, overcome information overload, and deliver personalized recommendations has become a hot topic in the field of recommendation systems. Recommendation systems rely on users' historical behavioral data (such as ratings and clicks) to build personalized user models. However, because the collected data are observational rather than experimental, they contain various biases that significantly degrade the generalization performance of the learned model. Thanks to the rapid development of deep learning in recent years, debiasing research for recommendation systems has made significant progress. Nevertheless, existing debiasing models still suffer from two serious shortcomings:

1) Lack of generality: these methods aim to address one or a few biases in specific scenarios, and they perform poorly on real data that usually contains multiple types of bias. Moreover, owing to the lack of a theoretical understanding of bias, the reliability of most existing methods is difficult to guarantee theoretically.

2) Lack of adaptability: the effectiveness of these methods depends heavily on correct debiasing configurations (such as pseudo-labels and propensity scores). Obtaining such configurations is quite difficult and requires manual tuning based on domain expertise. Worse still, the appropriate debiasing configuration may change over time, because newly added user-item interaction records can shift the original data distribution, which further increases the difficulty of debiasing.

This thesis studies how to design a general debiasing framework that overcomes the above two shortcomings and achieves both generality and adaptability. Specifically:

1. We first analyze the source of bias from the perspective of risk discrepancy, i.e., the gap between the expectation of the empirical risk function and the true risk function. To bridge this gap, we derive a universal automatic debiasing framework that introduces debiasing hyperparameters and uses a small amount of uniform data to guide unbiased training. Theoretical analysis shows that this framework subsumes many existing mainstream debiasing strategies. Experimental results on two public datasets and one simulated dataset show that the automatic debiasing framework achieves clear performance advantages over the compared methods, with high generality and reliability.

2. A computational complexity analysis of the automatic debiasing framework reveals that its high complexity limits its application to large-scale industrial settings. We therefore propose a lightweight version of the framework that greatly improves efficiency by introducing an adaptive sampling strategy. Theoretical analysis proves that the risk function remains unbiased after sampling; in addition, the adaptive sampling strategy significantly reduces the sampling variance and accelerates model convergence. Experimental results show that the adaptive sampling strategy greatly improves training efficiency without sacrificing model performance.
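The first contribution, as summarized above, reweights the empirical risk on biased logged data while a small amount of uniform data steers the debiasing hyperparameters, which matches a bi-level (meta-learning) formulation. The sketch below illustrates that general formulation only; the toy data, linear model, learning rates, and per-sample weight parameterization are illustrative assumptions, not the thesis implementation.

```python
# Minimal sketch of meta-learned debiasing weights (assumed formulation):
# biased data trains the model under learnable per-sample weights, and a
# small uniform set provides the outer objective for those weights.
import torch

d = 16
torch.manual_seed(0)
x_biased,  y_biased  = torch.randn(512, d), torch.randn(512, 1)   # biased logs (toy)
x_uniform, y_uniform = torch.randn(64, d),  torch.randn(64, 1)    # small uniform set (toy)

theta = torch.zeros(d, 1, requires_grad=True)   # recommendation model parameters
phi   = torch.zeros(512, requires_grad=True)    # debiasing hyperparameters (per sample)
opt_theta = torch.optim.SGD([theta], lr=0.1)
opt_phi   = torch.optim.SGD([phi],   lr=0.1)

def mse(pred, target, weights=None):
    err = (pred - target) ** 2
    return (weights * err).mean() if weights is not None else err.mean()

for step in range(100):
    # 1) Weighted empirical risk on the biased data.
    w = torch.sigmoid(phi).unsqueeze(1)
    train_loss = mse(x_biased @ theta, y_biased, w)

    # 2) Virtual one-step update of theta, keeping the graph so the uniform
    #    loss can be differentiated w.r.t. the debiasing parameters phi.
    grad_theta = torch.autograd.grad(train_loss, theta, create_graph=True)[0]
    theta_virtual = theta - 0.1 * grad_theta

    # 3) Outer objective: update phi so the virtually updated model does well
    #    on the uniform (unbiased) data.
    opt_phi.zero_grad()
    mse(x_uniform @ theta_virtual, y_uniform).backward()
    opt_phi.step()

    # 4) Real update of theta with the refreshed (detached) weights.
    w = torch.sigmoid(phi).detach().unsqueeze(1)
    opt_theta.zero_grad()
    mse(x_biased @ theta, y_biased, w).backward()
    opt_theta.step()
```

For the lightweight variant, the abstract states that the risk function remains unbiased after adaptive sampling. One standard way to obtain this property, assumed here purely for illustration, is to rescale each sampled loss term by the inverse of its sampling probability, so the sampled estimate matches the full-data risk in expectation regardless of how the sampling distribution adapts:

```python
# Minimal sketch (assumed formulation) of an unbiased sampled risk estimate:
# each sampled term is divided by its own sampling probability.
import torch

losses = torch.rand(512)                       # per-sample losses (toy values)
p = torch.softmax(losses, dim=0)               # adaptive, loss-aware sampling probabilities
idx = torch.multinomial(p, num_samples=64, replacement=True)

full_risk = losses.sum()
sampled_risk = (losses[idx] / p[idx]).mean()   # equals full_risk in expectation
print(full_risk.item(), sampled_risk.item())
```

Concentrating probability mass on large-loss terms is one way such a scheme can reduce the variance of the estimate, which is in the spirit of the variance-reduction and convergence claims above; the thesis's actual sampling distribution is not specified in the abstract.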
Keywords/Search Tags:Recommendation, Bias, Meta-learning, Sampling