The rapid evolution of Internet technologies has driven a surge in online user engagement and an exponential growth in the volume of accessible information. This flood of data frequently exceeds what users can efficiently handle, giving rise to the problem of information overload. To address this critical issue, recommender systems play a crucial role by inferring user interests and preferences from historical behaviors. With the assistance of recommender systems, users gain access to fresh and valuable information resources, which enhances the efficiency and quality of their information acquisition. Personalized recommender systems have become ubiquitous in everyday life, and they will continue to play an important role in driving economic and social development, providing people with greater convenience and value in both their personal and professional lives.

Many recommendation algorithms rely heavily on historical user-item interaction data, including user ratings, click behavior, and purchase history, to discover the similarity between users and items. Unfortunately, given their limited time and attention, users engage with only a small subset of the available items. The resulting high sparsity degrades recommendation accuracy, which often falls short of users' actual demands. Addressing feedback sparsity is therefore essential for advancing recommender systems. The sparsity challenge manifests in two ways: the scarcity of positive feedback and the absence of negative feedback. This dissertation investigates both aspects, with a particular emphasis on sampling methodologies. Taking a comprehensive approach, the research aims to improve recommendation performance, dynamically capture shifts in user preferences, and accelerate algorithm convergence by exploring the following critical challenges.

1. Tailored Thompson Sampling: When positive samples are lacking and item exposure rates are notably low relative to the vast numbers of online users and items, recommender systems struggle to predict user interests accurately, and this imbalance introduces a significant bias into the estimation of user preferences. Thompson sampling is a well-established technique for balancing exploration and exploitation, but it adapts poorly to complex models and interaction-data distributions. Addressing this issue requires Thompson sampling methods tailored to feature-interaction models that approximate the posterior distribution precisely; such methods are key to uncovering user interest preferences and mitigating the bias caused by the lack of positive samples, thereby raising the cumulative click-through rate within a finite number of item exposures. To tackle this problem, this dissertation introduces a Thompson sampling approach based on variational inference. The approach constructs a customized parametric posterior approximation for a click-through rate estimation model with automatic interaction search, efficiently capturing variations in user interests and increasing positive feedback under the constraint of limited user exposures.
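To make the exploration-exploitation loop concrete, the following is a minimal sketch of Thompson sampling with a variational posterior. It assumes a plain linear click model with a diagonal Gaussian posterior q(w) = N(mu, softplus(rho)^2), not the dissertation's feature-interaction model with automatic interaction search, and the class name `VariationalThompsonCTR` and its methods are illustrative only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class VariationalThompsonCTR:
    """Thompson sampling for CTR estimation with a diagonal Gaussian variational
    posterior q(w) = N(mu, softplus(rho)^2) over the weights of a linear click model."""

    def __init__(self, n_features, lr=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        self.mu = np.zeros(n_features)
        self.rho = np.zeros(n_features)                 # softplus(0) ~= 0.69 initial std
        self.lr = lr

    def _sigma(self):
        return np.log1p(np.exp(self.rho))               # softplus keeps the std positive

    def select(self, candidate_feats):
        # Thompson step: sample one weight vector from q and exploit it greedily.
        w = self.mu + self._sigma() * self.rng.standard_normal(self.mu.shape)
        return int(np.argmax(sigmoid(candidate_feats @ w)))

    def update(self, x, clicked):
        # One reparameterized gradient-ascent step on the ELBO for a single
        # observation (Bernoulli likelihood, standard normal prior on w).
        sigma = self._sigma()
        eps = self.rng.standard_normal(self.mu.shape)
        w = self.mu + sigma * eps
        grad_w = (clicked - sigmoid(x @ w)) * x          # grad of log p(click | x, w)
        dsig_drho = sigmoid(self.rho)                    # derivative of softplus
        grad_mu = grad_w - self.mu                       # likelihood term plus KL term (-mu)
        grad_rho = (grad_w * eps - (sigma - 1.0 / sigma)) * dsig_drho
        self.mu += self.lr * grad_mu
        self.rho += self.lr * grad_rho
```

In each round, `select` draws one weight vector from the posterior and exploits it greedily over the candidate item features, while `update` performs a single reparameterized ELBO step on the observed click, so exploration naturally shrinks as the posterior concentrates.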
2. Bias Reduction of Static Samplers: When negative samples are lacking, sampling solutions that select a subset of items to approximate computations over the entire dataset are advantageous in terms of algorithmic efficiency. Commonly employed static sampling methods such as uniform sampling, while efficient, suffer from a significant bias relative to the ideal softmax distribution. This bias makes it difficult to single out more challenging samples during training; as training progresses, such samplers tend to oversample easy items that the model has already learned well. Resolving this dilemma requires correcting the bias of the static sampling distribution so as to improve the quality of sampled items while preserving sampling efficiency, which is a pivotal avenue for improving the final recommendation quality achievable with static samplers. To tackle this problem, this dissertation introduces a novel sampling approach based on importance resampling. The approach dynamically adapts the static sampling distribution according to item similarity scores while keeping the sampling probabilities tractable. Consequently, the resampling method reduces the deviation between the static sampling distribution and the desired dynamic distribution, thereby boosting the performance of the recommendation model.

3. Accurate Approximation of Dynamic Samplers: When negative samples are lacking, dynamic samplers often select items with higher relevance scores as high-quality negative samples to strengthen model training. However, these samplers encounter challenges in large-scale recommendation settings because the computational cost grows linearly with the number of items. In addition, existing dynamic sampling methods suffer from intractable sampling probabilities and significant deviations from the expected sampling distribution, which limits their ability to strike an efficient balance between utility and efficiency. Designing dynamic sampling methods that precisely approximate the expected distribution while remaining efficient is therefore critical for improving both the convergence speed and the effectiveness of the model. To address this challenge, this dissertation introduces a dynamic sampling approach based on an inverted multi-index structure. The method reduces the computation over all items to a computation over the centroids of each subspace, achieving a more accurate approximation of the ideal softmax distribution and facilitating rapid convergence of the recommendation model.
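As a rough illustration of how an inverted multi-index can stand in for full-catalog scoring, the sketch below quantizes the two halves of each item embedding with k-means, scores only the occupied (centroid, centroid) cells, and then samples an item from the chosen cell. It is a simplified sketch under the assumptions of a two-way index and uniform within-cell sampling, not the dissertation's implementation, and the function names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_inverted_multi_index(item_embs, n_clusters=64, seed=0):
    """Quantize items on two embedding halves; each item falls into a (k, l) cell."""
    d = item_embs.shape[1] // 2
    km1 = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit(item_embs[:, :d])
    km2 = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10).fit(item_embs[:, d:])
    cells = {}
    for i, (k, l) in enumerate(zip(km1.labels_, km2.labels_)):
        cells.setdefault((k, l), []).append(i)
    return km1.cluster_centers_, km2.cluster_centers_, cells, d

def sample_negative(user_vec, centroids1, centroids2, cells, d, rng=None):
    """Sample one item roughly proportional to exp(user . item) by scoring
    centroid pairs instead of every item in the catalog."""
    rng = np.random.default_rng() if rng is None else rng
    u1, u2 = user_vec[:d], user_vec[d:]
    keys = list(cells.keys())
    # Cell logit = approximate item score + log cell size, so the cell marginal
    # mimics a softmax over items whose scores are replaced by centroid scores.
    logits = np.array([u1 @ centroids1[k] + u2 @ centroids2[l] + np.log(len(cells[(k, l)]))
                       for k, l in keys])
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    k, l = keys[rng.choice(len(keys), p=probs)]
    item = rng.choice(cells[(k, l)])        # uniform draw within the chosen cell
    return int(item)
```

Because the cell probability and the uniform within-cell probability are both explicit, the overall sampling probability stays tractable, and the per-draw cost depends on the number of occupied cells rather than the catalog size.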
4. Cache-Augmented Sampling Pool: When negative samples are lacking, the in-batch sampling method faces a sample selection bias under constrained computational resources. The limited number of candidate items, together with their skewed distribution, introduces substantial deviation from the desired distribution and consequently harms the final recommendation performance. One way to enhance sample diversity is to expand the sampling pool with uniformly sampled items; however, this mainly improves the approximation by increasing the number of sampled instances and does not adapt to the dynamically changing expected distribution. Dynamically augmenting the sampling pool so that it adapts to model updates and yields high-quality training samples for the candidate set is therefore a key concern for improving the accuracy of recommendation retrieval models. To tackle this challenge, this dissertation introduces an augmented sampling pool based on sample occurrence estimation, in which the importance of each item is assessed by its sampling frequency. With the augmented sampling pool, samplers can capture dynamic changes in the training data, select higher-quality samples, and ultimately enhance the model's capacity to predict user preferences.
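The cache idea can be sketched as follows, assuming a two-tower retrieval setting in which negatives are drawn from the in-batch items plus a cache whose entries are kept or evicted according to how often they are drawn. The class `CacheAugmentedPool` and its eviction rule are illustrative assumptions, not the dissertation's exact occurrence estimator.

```python
import numpy as np

class CacheAugmentedPool:
    """In-batch candidates augmented with a cache whose entries are retained or
    evicted according to how often they are drawn (a sample-occurrence estimate)."""

    def __init__(self, n_items, cache_size=256, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n_items = n_items
        self.cache = self.rng.choice(n_items, size=cache_size, replace=False)
        self.counts = np.zeros(cache_size)              # occurrence estimates for cached items

    def sample(self, user_vecs, item_embs, batch_items, num_neg=5):
        # Candidate pool = in-batch items plus cached items
        # (masking each user's own positive is omitted for brevity).
        pool = np.concatenate([batch_items, self.cache])
        scores = user_vecs @ item_embs[pool].T           # (batch, |pool|) relevance scores
        probs = np.exp(scores - scores.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        negs = np.stack([self.rng.choice(pool, size=num_neg, p=p) for p in probs])
        # Count how often each cached item was drawn; frequent items are judged important.
        drawn, freq = np.unique(negs, return_counts=True)
        for item, c in zip(drawn, freq):
            hit = np.where(self.cache == item)[0]
            if hit.size:
                self.counts[hit[0]] += c
        self._refresh()
        return negs

    def _refresh(self, n_replace=16):
        # Evict the least-sampled cache entries and replace them with fresh uniform
        # draws, so the pool tracks the distribution implied by the current model.
        evict = np.argsort(self.counts)[:n_replace]
        self.cache[evict] = self.rng.choice(self.n_items, size=n_replace)
        self.counts[evict] = 0.0
```

Frequently drawn cache entries are treated as important candidates and retained, while rarely drawn ones are replaced by fresh uniform draws, so the sampling pool drifts with the model instead of staying fixed.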