
Research On Online Recommendation Based On Contextual Combinatorial Bandit Algorithm

Posted on: 2023-01-19    Degree: Master    Type: Thesis
Country: China    Candidate: H X Han    Full Text: PDF
GTID: 2568307076985479    Subject: Software engineering
Abstract/Summary:
Recommender systems play a crucial role in the Internet age of information overload. The recommendation process must trade off two goals: exploring new items to maximize user satisfaction, and exploiting items the user has already interacted with to match known interests. This problem is widely recognized as the exploration-exploitation (EE) dilemma. Bandit algorithms have proven to be an effective solution and have therefore been widely studied and applied in the recommendation field. As the scale of users and items in real-world applications grows, formalizing online recommendation as a bandit problem poses three challenges. First, sparse interactions between users and items make it difficult to mine user preferences. Second, as items are continually added to the system, modeling each single item as an arm leads to a large-scale arm space. Third, the widely used Bernoulli reward mechanism does not take full advantage of the rich implicit feedback available in recommender systems.

To address these problems, this thesis proposes the dynamic clustering-based contextual combinatorial bandit algorithm (DC³MAB), which consists of three key components: a dynamic user clustering strategy, an item partitioning approach, and a multi-class reward mechanism. Specifically, to accurately mine user preferences from sparse interaction behavior, the dynamic user clustering strategy groups users with similar preferences into the same cluster, and users in the same cluster share bandit parameters. To cope with the unknown reward distribution of each arm and the high computational complexity caused by large-scale arms, the item partitioning approach quickly filters a few subsets of items from the full item set based on current interactions and models each item subset as an arm. To capture users' latent preferences from complex interaction behaviors, the multi-class reward mechanism assigns different reward weights to different interaction types, reflecting different levels of preference for recommended items.

Building on clustering bandits, the thesis further proposes the collaborative combinatorial bandit (CoCoB) algorithm, which uses a two-sided bandit design to achieve adaptive user clustering. Specifically, the user-bandit is based on an improved Bayesian bandit that models users as arms in order to explore similarity between users. By comparing a similarity probability against a threshold, it judges whether the target user has neighboring users, and adaptively draws on either the neighbors' preferences or the user's personal preferences to guide recommendation decisions. The item-bandit models items as arms and leverages the preference information from the user-bandit to produce a list of recommendations at each round, increasing recommendation variety.

Extensive experiments on three real-world datasets demonstrate the superiority of the proposed DC³MAB and CoCoB over state-of-the-art bandit baselines.
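The abstract does not give DC³MAB's update rules, so the following is only a minimal Python sketch of how its three components could fit together: a LinUCB-style contextual bandit whose parameters are shared by one user cluster, pre-filtered item subsets treated as arms, and graded (multi-class) rewards in place of 0/1 Bernoulli feedback. The class names, the REWARD_WEIGHTS values, and the subset-scoring rule are illustrative assumptions, not the thesis's actual algorithm.

```python
import numpy as np

# Hypothetical reward weights per interaction type; the thesis assigns
# different weights to different implicit-feedback signals, but the
# actual values are not given in the abstract.
REWARD_WEIGHTS = {"view": 0.2, "click": 0.5, "favorite": 0.8, "purchase": 1.0, "none": 0.0}

class ClusterLinUCB:
    """LinUCB-style bandit whose parameters are shared by one user cluster."""
    def __init__(self, dim, alpha=1.0):
        self.A = np.eye(dim)     # ridge-regression Gram matrix
        self.b = np.zeros(dim)   # reward-weighted feature sum
        self.alpha = alpha       # exploration strength

    def score(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b
        # mean estimate plus an upper-confidence exploration bonus
        return x @ theta + self.alpha * np.sqrt(x @ A_inv @ x)

    def update(self, x, interaction):
        r = REWARD_WEIGHTS[interaction]  # multi-class reward, not 0/1
        self.A += np.outer(x, x)
        self.b += r * x

def recommend(cluster_bandit, item_subsets, k):
    """Treat each pre-filtered item subset as one arm: pick the subset with
    the highest mean UCB score, then return its top-k items."""
    best = max(item_subsets,
               key=lambda s: np.mean([cluster_bandit.score(x) for _, x in s]))
    ranked = sorted(best, key=lambda it: cluster_bandit.score(it[1]), reverse=True)
    return ranked[:k]
```

Here each element of `item_subsets` is assumed to be a list of `(item_id, feature_vector)` pairs; scoring a handful of subsets rather than every individual item is what keeps the per-round cost low under a large-scale arm space.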
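Similarly, a hedged sketch of CoCoB's two-sided design, assuming the "improved Bayesian bandit" behaves like a Beta-Bernoulli Thompson sampler: users are arms, a sampled similarity probability is compared against a threshold, and the recommender either borrows neighbors' preferences or falls back to the target user's own. The threshold value, the agreement-based update, and all names are hypothetical.

```python
import numpy as np

class UserBandit:
    """Beta-Bernoulli bandit over *users as arms*: it estimates the
    probability that another user shares the target user's preferences."""
    def __init__(self, n_users, threshold=0.5):
        self.a = np.ones(n_users)      # pseudo-counts of agreements
        self.b = np.ones(n_users)      # pseudo-counts of disagreements
        self.threshold = threshold     # similarity-probability threshold

    def sample_neighbors(self, target):
        # Thompson-sample a similarity probability for every other user
        p = np.random.beta(self.a, self.b)
        p[target] = 0.0
        return np.where(p > self.threshold)[0]

    def update(self, other, agreed):
        # agreed is 1 if the other user's feedback matched the target's
        self.a[other] += agreed
        self.b[other] += 1 - agreed

def user_preference(target, user_bandit, pref_vectors):
    """If sampled neighbors exceed the similarity threshold, borrow their
    averaged preferences; otherwise fall back to personal preferences."""
    neighbors = user_bandit.sample_neighbors(target)
    if len(neighbors) > 0:
        return pref_vectors[neighbors].mean(axis=0)
    return pref_vectors[target]
```

The returned preference vector would then feed the item-bandit, which scores items as arms and emits a list of recommendations per round; that combinatorial step is analogous to the subset scoring sketched above.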
Keywords/Search Tags:personalized recommendation, contextual multi-armed bandit, exploration-exploitation, collaborative filtering