
Uncertainty Measurement For Label-Incomplete Data And Its Applications

Posted on: 2022-10-04
Degree: Master
Type: Thesis
Country: China
Candidate: J L Chen
Full Text: PDF
GTID: 2480306731453294
Subject: Computer Science and Technology

Abstract/Summary:
Uncertainty is a universal and social phenomenon in human cognition. With the rapid development of intelligent information processing, incomplete data with partly missing label values, called label-incomplete data, are widespread in the real world. When analyzing and mining label-incomplete data, the uncertainty of the data and the difficulty of capturing effective information grow significantly as label incompleteness increases, which poses many challenges for the analysis and processing of such complex data. Uncertainty measurement, an effective tool for analyzing the uncertainty of data, can help reveal the essential characteristics of data and provide a novel strategy for data analysis. Existing uncertainty measurement approaches mainly target complete data or incomplete data with partly missing conditional feature values; there are relatively few studies on uncertainty measurement for label-incomplete data. Therefore, from the viewpoint of rough-fuzzy set theory, this work studies the uncertainty of label-incomplete data in terms of granular knowledge representation, uncertainty measurement models, and the design of feature selection algorithms. The main contributions are as follows:

· To address the uncertainty and label noise of label-incomplete data, min-max label membership functions and a fuzzy information granulation structure are constructed from nearest-neighbor samples; together they describe the structural characteristics between the incomplete labels and the conditional feature set over the whole sample universe. Approximation operators based on rough-fuzzy set theory are then designed, and robust uncertainty measures, namely approximation accuracy and roughness, are proposed.
· The relationship between the traditional dependence measure and the rough-fuzzy set model is investigated, and a novel uncertainty measurement approach based on the dependence measure is proposed. Its effectiveness and robustness are examined through theoretical and experimental analysis, respectively.
· To address feature selection for label-incomplete data, the relationship between the dependence measure and feature relevance is investigated, and two criteria for evaluating feature importance are designed. The theoretical properties of the two criteria for feature selection are then studied, and a heuristic semi-supervised rough feature selection algorithm based on dependency is proposed. Finally, comprehensive experiments, including classification performance and noise experiments, are conducted to evaluate the effectiveness of the proposed semi-supervised rough algorithm.
· To address the high computational cost of the proposed semi-supervised rough algorithm, especially on large-scale data, the concept of positive approximation is defined for a dynamic granulation structure under changing feature dimension, and an improved, accelerated semi-supervised rough algorithm based on positive approximation is proposed. The accelerated algorithm not only preserves the classification performance of the original algorithm but also speeds up feature selection by eliminating unnecessary computation on large-scale data.
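For readers unfamiliar with the measures named in the abstract, the following is a minimal sketch of approximation accuracy, roughness, the dependence measure, and dependency-driven greedy feature selection. It uses the classical (crisp, fully labeled) rough set definitions, not the thesis's rough-fuzzy model for label-incomplete data, whose membership functions and operators are not given here. All function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def partition(rows, features):
    """Group sample indices into equivalence classes by their values on `features`."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[f] for f in features)].add(i)
    return list(blocks.values())

def approximations(rows, features, target):
    """Return the (lower, upper) approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in partition(rows, features):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper

def accuracy(rows, features, target):
    """Classical approximation accuracy: |lower| / |upper|."""
    lower, upper = approximations(rows, features, target)
    return len(lower) / len(upper) if upper else 1.0

def roughness(rows, features, target):
    """Classical roughness: 1 - approximation accuracy."""
    return 1.0 - accuracy(rows, features, target)

def dependency(rows, features, labels):
    """Fraction of samples in the positive region (blocks whose labels all agree)."""
    pos = sum(len(b) for b in partition(rows, features)
              if len({labels[i] for i in b}) == 1)
    return pos / len(rows)

def greedy_select(rows, labels, n_features):
    """Forward selection: keep adding the feature that most increases dependency."""
    selected, best = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [(dependency(rows, selected + [f], labels), f) for f in remaining]
        score, f = max(scored, key=lambda s: (s[0], -s[1]))  # ties: lowest index
        if score <= best:     # no remaining feature improves dependency
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best

# Toy data (hypothetical): two binary features; samples 3 and 5 conflict.
rows = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 0), (1, 1)]
labels = ["a", "a", "b", "a", "b", "b"]
class_b = {i for i, y in enumerate(labels) if y == "b"}

selected, gamma = greedy_select(rows, labels, n_features=2)
acc_b = accuracy(rows, [0, 1], class_b)
```

On this toy data, samples 3 and 5 share feature values but disagree on the label, so they fall outside the positive region and inside the boundary of class "b". Each call to `dependency` recomputes the partition from scratch, which is the kind of repeated computation an acceleration strategy would aim to reduce.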
Keywords/Search Tags: Rough set theory, Rough-fuzzy set theory, Label-incomplete data, Uncertainty measurement, Feature selection