The Research On Label Enhancement-Based Multi-Label Feature Selection Algorithm With Fuzzy Rough Sets

Posted on: 2023-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: C Z Xiong
GTID: 2568306803462714
Subject: Computer Science and Technology
Abstract/Summary:
With the development of data mining, high-dimensional feature spaces pose a significant challenge to traditional supervised learning. As a crucial pre-processing step for learning tasks, feature selection mitigates the “curse of dimensionality” caused by irrelevant and redundant features in a high-dimensional feature space. Fuzzy rough set theory, an effective tool for feature selection, can handle the vagueness of data with continuous features and has attracted significant attention recently. However, unlike traditional supervised learning, the most prominent characteristic of multi-label data is label ambiguity: an object is associated with multiple semantics simultaneously. In addition, in most real-world applications the labels of a given instance differ in relative importance, and the uniform-distribution assumption of multi-label learning cannot describe this scenario well. Therefore, based on fuzzy rough set theory, this thesis investigates multi-label feature selection from these two perspectives and obtains the following results:

1. Based on the framework of the fuzzy discernibility matrix, the fuzzy label discernibility relation and the fuzzy relative discernibility relation are defined to assess the discernibility between object pairs in the feature and label spaces. Under these relations, the discernibility capability of a feature is used to assess its relevance to the labels, and a discernibility-based significance is defined for features. On this basis, a ranking-based method with fuzzy discernibility is presented for feature selection on multi-label data. Finally, to validate the effectiveness of the method, comparison experiments are conducted against four representative multi-label feature selection approaches on ten multi-label datasets with the multi-label classifier MLKNN. Under six widely accepted evaluation metrics, the experimental results indicate that the proposed method attains superior performance on multi-label feature selection.

2. To describe the relative differences among labels, label distribution learning is integrated into multi-label feature selection to mine the supervised information that equivalence relations in the label space ignore. A novel label enhancement method is proposed from the perspective of granular computing: the fuzzy similarity between instances is calculated to mine hidden label relevance and to obtain label distributions from the logical labels. To handle the vagueness of label distribution data, fuzzy rough sets are combined with the discrimination index, and the resulting fuzzy neighborhood discrimination index is used to assess the discernibility of features with continuous values. A novel label enhancement-based feature selection approach is then presented to handle the high dimensionality of multi-label learning. Furthermore, to verify the validity of the algorithm, comparison experiments are conducted against five representative multi-label feature selection methods on twelve multi-label datasets. Under six widely used evaluation metrics, the experimental results indicate that the proposed algorithm achieves superior performance over the compared algorithms.
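The two ideas in the abstract, obtaining label distributions from fuzzy similarity between instances and ranking features by how well they discern object pairs, can be illustrated with a minimal Python sketch. The abstract does not give the thesis's formulas, so everything here is an assumption made for illustration: the function names (fuzzy_similarity, enhance_labels, rank_features), the similarity measure (one minus the mean absolute feature difference), and the min-based significance score. It should not be read as the author's algorithm.

```python
import numpy as np


def fuzzy_similarity(X):
    """Pairwise fuzzy similarity: 1 minus the mean absolute feature difference.

    Assumption: every column of X is already scaled to [0, 1].
    """
    diff = np.abs(X[:, None, :] - X[None, :, :])  # shape (n, n, d)
    return 1.0 - diff.mean(axis=2)


def enhance_labels(X, Y):
    """Turn logical labels Y (n x q, entries 0/1) into label distributions.

    Each instance collects label evidence from all instances, weighted by
    fuzzy similarity; each row is then normalised to sum to 1.
    """
    scores = fuzzy_similarity(X) @ Y
    totals = scores.sum(axis=1, keepdims=True)
    # Rows with no label evidence at all are left as zeros.
    return np.divide(scores, totals, out=np.zeros_like(scores), where=totals > 0)


def rank_features(X, Y):
    """Rank features by how well they discern pairs that differ in labels.

    Assumed significance: the summed min(feature discernibility, label
    discernibility) over all ordered pairs, divided by the total label
    discernibility.
    """
    label_disc = np.abs(Y[:, None, :] - Y[None, :, :]).mean(axis=2)  # (n, n)
    denom = label_disc.sum()
    scores = np.zeros(X.shape[1])
    for a in range(X.shape[1]):
        feat_disc = np.abs(X[:, None, a] - X[None, :, a])  # (n, n)
        if denom > 0:
            scores[a] = np.minimum(feat_disc, label_disc).sum() / denom
    order = np.argsort(-scores, kind="stable")  # most significant first
    return order, scores


# Toy data: feature 0 separates the two label groups, feature 1 is constant.
X = np.array([[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])

D = enhance_labels(X, Y)             # one label distribution per row
order, scores = rank_features(X, Y)  # feature indices, best first
print(D)
print(order, scores)
```

On this toy data the constant feature cannot discern any pair, so it receives a score of zero and is ranked last. The enhanced distributions keep most of the mass on each instance's original label while assigning a smaller share to the label carried by the other group.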
Keywords/Search Tags: Feature selection, Multi-label data, Fuzzy rough sets, Label enhancement, Discrimination index