Research And Application Of Adaptive Label Thresholding Algorithm For Online Multi-label Classification

Posted on: 2024-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: H C Tang
GTID: 2568306917954029
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
Multi-label classification is widely applied in fields such as image recognition, text classification, and bioinformatics. Although the field has advanced considerably, some problems remain open. First, in most thresholding strategies the threshold model is learned mainly from the scoring model's outputs on all training instances and is therefore closely tied to the scoring model. Yet most traditional multi-label classification algorithms use fixed label thresholds or independent threshold functions as parameters, and the relationship between the threshold model and the scoring model has not been investigated. Second, the loss function and the bias term, both important components of support vector machines, are not examined in most support vector machine based algorithms.

To address these issues, this thesis proposes a framework that jointly optimizes the scoring model and the threshold model, and derives two algorithms from it: the adaptive label thresholding algorithm and the fixed label thresholding algorithm. It also investigates how different loss functions, and the presence or absence of bias terms, affect the adaptive label thresholding algorithm. Both algorithms update the model incrementally, and in both the two models are integrated into a single optimization problem. The algorithms are validated and analyzed experimentally on publicly available datasets. The study covers the following four main areas:

(1) An online multi-label classification framework that jointly optimizes scoring models and threshold models is proposed. The core idea is to treat the threshold model and the scoring model as components of one online multi-label classifier and to combine them into a single online optimization problem. Under this framework, two algorithms are proposed, the adaptive label thresholding algorithm and the fixed label thresholding algorithm, both optimized using online gradient descent. Experiments show that both algorithms have advantages on multiple multi-label performance metrics.

(2) The effect of different loss functions on the adaptive label thresholding algorithm is investigated. The original adaptive label thresholding algorithm is built on the hinge loss function. Three further binary classification loss functions are therefore extended to multi-label classification loss functions, and three new adaptive label thresholding algorithms are proposed based on them. The effects of the different loss functions on performance are analyzed, and experiments on six datasets compare the algorithms with five state-of-the-art online multi-label classification algorithms. The results show that the adaptive label thresholding algorithm based on the logarithmic loss function performs best, which supports the effectiveness of the proposed algorithms.

(3) The effect of bias terms on the adaptive label thresholding algorithm is investigated. The adaptive label thresholding algorithm is a support vector machine based multi-label classification algorithm with excellent performance, but it does not consider the role of bias terms. In single-label classification, the bias term has an important impact on the performance of support vector machines, especially when the label distribution is unbalanced, a situation that is common in text datasets. Comparison experiments against the original algorithm on nine text datasets show that the adaptive label thresholding algorithm with bias terms achieves higher performance.

(4) An online multi-label classification system based on the adaptive label thresholding algorithm is designed and developed. It lets users customize tasks and algorithms and build different models according to their needs. Users can define tasks by choosing the feature extraction algorithm and the number of labels, and can configure algorithms by choosing the loss function and whether to use bias terms. For each task, the user can select the appropriate algorithm for modeling and can set the hyperparameters during model building. After a model is built, each subsequent training run updates it incrementally.
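The joint optimization described in areas (1) to (3) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual objective or update rules, so everything below is an assumption made for illustration: the class name `OnlineThresholdedClassifier`, the linear form of both the per-label scoring model and the threshold model, the decision rule "label j is relevant when its score exceeds the instance's threshold", and the per-label hinge or logistic loss on that margin. The sketch only shows how a scoring model and a threshold model might share one loss and be updated together by online gradient descent, with an optional bias term.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class OnlineThresholdedClassifier:
    """Hypothetical sketch, not the thesis's algorithm: linear per-label
    scorers and a linear threshold model trained under one shared loss."""

    def __init__(self, n_features, n_labels, lr=0.1, loss="hinge", use_bias=False):
        self.use_bias = use_bias
        dim = n_features + (1 if use_bias else 0)
        self.W = [[0.0] * dim for _ in range(n_labels)]  # scoring model, one vector per label
        self.t = [0.0] * dim                             # threshold model, shared by all labels
        self.lr = lr
        self.loss = loss

    def _augment(self, x):
        # A bias term is modeled as an extra constant feature (assumption).
        return list(x) + [1.0] if self.use_bias else list(x)

    def margins(self, x):
        xa = self._augment(x)
        threshold = dot(self.t, xa)
        return [dot(w, xa) - threshold for w in self.W]

    def predict(self, x):
        # Label j is predicted relevant (+1) when its score exceeds the threshold.
        return [1 if m > 0 else -1 for m in self.margins(x)]

    def _slope(self, z):
        # Derivative of the loss with respect to z = y * margin.
        if self.loss == "hinge":          # max(0, 1 - z)
            return -1.0 if z < 1.0 else 0.0
        if z >= 0:                        # logistic: log(1 + exp(-z)), stable form
            e = math.exp(-z)
            return -e / (1.0 + e)
        return -1.0 / (1.0 + math.exp(z))

    def partial_fit(self, x, y):
        """One online gradient descent step on one instance; y holds +1/-1 per label."""
        xa = self._augment(x)
        coeffs = [yj * self._slope(yj * m) for yj, m in zip(y, self.margins(x))]
        total = sum(coeffs)
        for w, c in zip(self.W, coeffs):
            for i, xi in enumerate(xa):
                w[i] -= self.lr * c * xi          # gradient w.r.t. w_j is c_j * x
        for i, xi in enumerate(xa):
            self.t[i] += self.lr * total * xi     # gradient w.r.t. t is -sum(c_j) * x


# Toy stream: the first instance has both labels relevant, the second has none,
# so only a threshold that varies with the input can separate them.
stream = [([1.0, 0.0], [1, 1]), ([0.0, 1.0], [-1, -1])]
clf = OnlineThresholdedClassifier(n_features=2, n_labels=2)
for _ in range(20):
    for x, y in stream:
        clf.partial_fit(x, y)
print(clf.predict([1.0, 0.0]), clf.predict([0.0, 1.0]))
```

In this sketch the threshold weights move in the opposite direction to the score weights on every update, which is one simple way the two models can be coupled inside a single optimization problem. Passing `loss="logistic"` or `use_bias=True` switches on the variants that areas (2) and (3) study, again only under the assumptions stated above.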
Keywords/Search Tags: Online learning, Multi-label classification, Label threshold, Loss function, Bias term