| Data classification is one of the most fundamental tasks in machine learning. As modern computer technology spreads across industrial fields, massive amounts of data have become easy to obtain. Classifying the recorded data can reveal its underlying, inherent distribution structure. However, owing to the complexity of the real world, recorded data is characterized by uncertainty, multiple sources, and incompleteness: the class membership of samples in the feature space is uncertain, samples are described by multiple features, and labeled samples are scarce. The performance of traditional data classification methods is therefore often limited. Belief functions theory provides a feasible mathematical framework for such complex data because it represents uncertainty well and supports multi-source information fusion. This thesis is therefore devoted to the data classification problem in the framework of belief functions theory. (1) For multi-feature data, different features generally differ in importance for the classification task. We therefore propose a supervised classification model based on weighted feature evidence fusion in the framework of belief functions theory. First, different features are treated as multiple sources of evidence supporting the classification of samples, and kernel density estimation is used to quantify the evidence. Second, the pieces of evidence are weighted, and an optimization strategy that minimizes the error of the predicted labels is designed to obtain the optimal weights. Finally, the belief partition of the test samples is obtained by fusing the evidence from multiple features; it measures the uncertainty in the classification of complex samples. The proposed model is a supervised classifier that requires no pre-specified hyperparameters. Comparison experiments on commonly used datasets demonstrate the validity and superiority of the proposed model. Furthermore, the proposed model is applied to uncover the underlying mechanisms 
of atmospheric corrosion, providing technical support for corrosion science research. (2) For problems with only a few labeled samples, a semi-supervised classification model based on soft evidential label propagation is proposed in the framework of belief functions theory. First, the basic belief assignment function is employed as the soft evidential label of a sample to accurately quantify the uncertainty of its classification. Second, a soft label propagation mechanism based on multi-source evidence fusion is designed. Under this mechanism, unlabeled samples iteratively update their own labels by absorbing the label information of neighboring samples, which avoids the influence of a manually specified belief threshold. Comparison experiments on commonly used datasets show that the proposed model performs well on classification tasks for both multi-feature data and graph data, and that it is robust to its parameters. Furthermore, the proposed model is applied to the prediction of atmospheric corrosion, a practical engineering problem, providing technical support for the corrosion evaluation of materials in service. (3) For problems without any label information, we propose a soft label propagation clustering model based on improved belief-peaks in the framework of belief functions theory. The proposed model aims to determine the number of clusters present and to derive a soft partition of the dataset. First, an improved belief measure is proposed to quantify the possibility that a sample is a cluster center, so that cluster centers and outliers can be detected accurately without relying on large numbers of neighboring samples. Second, soft labels are employed to measure the uncertainty of sample classification, and a new method for calculating neighbor weights is proposed that combines Euclidean distance with sample reliability. Finally, each sample updates its own label by partly 
absorbing the label information of its neighbors. Comparison experiments on several commonly used datasets demonstrate that the proposed model achieves superior performance in revealing the inherent distribution of data samples and is robust to its parameters. Furthermore, the proposed model is employed to study the corrosion-resistance components of low-alloy steel materials in a seawater environment, providing technical support for the research and development of new materials. The work of this thesis not only advances theoretical research on data classification but also promotes its intersection with material corrosion science. Processing corrosion data within the framework of belief functions theory opens up a new direction in corrosion research. |
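
To make the evidence-fusion idea shared by the three models more concrete, the following is a minimal sketch of two standard belief-function operations: Dempster's rule of combination and Shafer discounting. The abstract does not state which combination rule or weighting scheme the thesis actually uses, so the choice of these operations, the function names `dempster_combine` and `discount`, and all numbers are assumptions made purely for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Fuse two basic belief assignments (dict: frozenset -> mass)
    with Dempster's rule. Assumed here for illustration only."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing focal sets: the product mass goes to their intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal sets: the product mass counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize so the remaining masses sum to one.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def discount(m, weight, frame):
    """Shafer discounting: keep a fraction `weight` of each mass and
    transfer the remainder to the whole frame (total ignorance)."""
    out = {k: weight * v for k, v in m.items() if k != frame}
    out[frame] = weight * m.get(frame, 0.0) + (1.0 - weight)
    return out


if __name__ == "__main__":
    # Hypothetical two-class frame and two hypothetical feature sources.
    A, B = frozenset({"a"}), frozenset({"b"})
    frame = A | B
    feature_1 = {A: 0.6, B: 0.1, frame: 0.3}
    feature_2 = {A: 0.2, B: 0.5, frame: 0.3}
    # Down-weight the second source before fusing (weight is illustrative).
    fused = dempster_combine(feature_1, discount(feature_2, 0.5, frame))
    for focal_set, mass in fused.items():
        print(sorted(focal_set), round(mass, 4))
```

In this sketch, the mass left on the whole frame after fusion expresses how uncertain the classification of a sample remains, and the discounting weight plays the role of a per-feature importance.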