Research On Tendentious Label And Streaming Data Feature Selection Algorithm

Posted on:2020-03-16

Degree:Master

Type:Thesis

Country:China

Candidate:F Chen


GTID:2417330575996215

Subject:Statistical information technology

Abstract/Summary:


Multi-label learning, which addresses the ambiguity found in the real world, has attracted wide attention from researchers since it was proposed. It improves descriptive accuracy by extending the classification result from a single label to multiple labels. Label distribution learning converts traditional logical labels into a probability distribution, which expresses more intuitively the degree to which each label describes an instance. Feature selection improves classifier accuracy by reducing the dimensionality of massive, high-dimensional data. However, traditional feature selection algorithms are computationally complex and cannot process streaming feature data. To address these problems, this thesis proposes two feature selection algorithms. The main contents are as follows:

(1) To deal with the tendentious-label problem caused by the granularity of feature descriptions in multi-label learning, many researchers use the relationship between the features and the entire label set to pick important features and then construct the corresponding feature subspace. Such approaches, however, overlook features that are strongly relevant to a single label but show no correlation with the label space as a whole. Because these features go unselected, classifier accuracy inevitably declines. In addition, many current mutual-information-based feature selection methods for multi-label learning rely on traditional entropy, which is computationally expensive because it lacks a complement property. Therefore, a new definition of rough entropy is introduced, and the k-Kernel Feature Selection based on Rough Mutual Information (kKFSRMI) algorithm is proposed. Experiments show that kKFSRMI is effective.

(2) Traditional feature selection algorithms cannot process streaming feature data, their redundancy calculation is complicated, and their description of instances is not accurate enough. To solve these problems, Multi-label Distribution Learning Feature Selection with Streaming Data Using Rough Set (FSSRS) is proposed. First, the online streaming feature selection framework is introduced into multi-label learning. Second, the original conditional probability is replaced by the dependency from rough set theory, which makes streaming feature selection faster and more efficient because it uses only information computed from the data itself. Finally, since each label describes the same instance to a different degree in the real world, label distributions are used in place of traditional logical labels to describe instances more accurately. Experimental results show that the proposed algorithm retains the features that are highly correlated with the label space, so classification accuracy improves to some extent compared with using no feature selection.
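The rough set dependency mentioned in part (2) can be illustrated with a small sketch. This is not the FSSRS algorithm itself, whose details the abstract does not give. It is a simplified greedy loop under several assumptions: features are discrete, each instance carries a logical label vector (a tuple) rather than a label distribution, and a newly arrived feature is kept only if it raises the dependency of the label set on the selected features. The function names partition, dependency, and stream_select are hypothetical.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Rough set dependency: share of rows lying in the positive region,
    i.e. in equivalence classes whose members all carry the same label."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, attrs):
        if len({labels[i] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def stream_select(rows, labels, feature_stream):
    """Greedy streaming selection (an illustrative assumption, not FSSRS):
    keep a feature only if it strictly increases the dependency."""
    selected = []
    best = dependency(rows, labels, selected)
    for f in feature_stream:  # features arrive one at a time
        score = dependency(rows, labels, selected + [f])
        if score > best:
            selected.append(f)
            best = score
    return selected, best


# Toy data: feature 0 determines the label vector, feature 1 is
# irrelevant, and feature 2 is redundant given feature 0.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = [(1, 0), (1, 0), (0, 1), (0, 1)]
selected, score = stream_select(rows, labels, [1, 0, 2])
print(selected, score)
```

On this toy data the loop discards the irrelevant feature 1, keeps feature 0, and then rejects feature 2 because it adds no further dependency. The computation uses only counts over the data, with no estimated conditional probabilities, which is the efficiency argument the abstract makes for the rough set approach.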

Keywords/Search Tags:

Multi-label learning, label distribution learning, feature selection, streaming data, rough set, rough mutual information


Related items

1	Research On Multi-label Learning And Its Application In Text Classification
2	Multi-label Feature Selection Based On Gravitational Field Model
3	Multi-label Learning With Non-equilibrium Labels Completions And Its Application
4	Multi-domain Data Classification Based On Multi-instance Multi-label Learning
5	Feature Selection Based On Rough Set For Binary-class Imbalanced Data
6	Multi-Label Learning Based On Metric Learning And Optimizing The Ranking
7	Research And Application Of Label Learning Based On Mixture Kernel Extreme Learning Machine
8	A Multi-label Learning Algorithm Combining Regression Kernel Extreme Learning Machine With Association Rules
9	Missing Multi-label Learning Of Imbalanced With Label Reconstruction
10	Research On Characteristics Reconstruction Of Label Distribution Learning And Its Application In Emotion Recognition