| In recent years, driven by the rapid development of artificial intelligence, clustering, an unsupervised learning method, has been studied extensively. Fuzzy clustering algorithms, that is, clustering algorithms based on fuzzy theory, have been widely studied since they were first proposed because they converge quickly and are easy to implement, and they have been applied in scenarios such as data mining, pattern recognition, and image processing. With the advent of the big data era, the data generated in production and daily life are becoming increasingly complex. This complexity lies mainly in four aspects: the collected data are contaminated by noise, the scale of the data keeps growing, the dimensionality of the data keeps increasing, and the data are collected from multiple views. When clustering complex data, traditional fuzzy clustering algorithms face several challenges: (i) On noise-contaminated data, traditional fuzzy clustering algorithms cannot overcome the influence of noise, which degrades their performance. (ii) In large-scale data scenarios, it is difficult for traditional fuzzy clustering algorithms to process the data within a reasonable time. (iii) Unimportant and redundant features in high-dimensional data and multi-view data hinder the algorithms from discovering the information hidden in the data. This dissertation studies clustering techniques for complex data, covering noise-contaminated data and large-scale data, as well as feature weight learning in high-dimensional data and multi-view data. The contributions of the dissertation are as follows: (1) For clustering noise-contaminated data, an entropy-based possibilistic clustering algorithm with an anti-noise property, named A-EPCM, is proposed. The algorithm has the following characteristics: (i) It is simple to implement. A-EPCM and EPCM share the same 
principle and implementation process. A-EPCM is obtained by adding a noise suppression term to the objective function of EPCM, which gives it its anti-noise property. (ii) Most state-of-the-art clustering algorithms try to remove noise in order to overcome its influence, whereas A-EPCM does not need to remove noise and still demonstrates good noise robustness. (2) Large-scale data cannot be loaded into RAM all at once because of limited memory capacity, so traditional clustering algorithms are not suitable for processing them. A fuzzy clustering algorithm termed MMPFC is proposed. The algorithm randomly divides the large-scale data into blocks, uses a clustering algorithm to obtain weights of the data points for representing each cluster, selects multiple representative points according to these weights, and adds an appropriate number of representative points to the subsequent data blocks for clustering, so as to improve clustering performance. (3) To address the problem that feature-weighted clustering algorithms cannot assign feature weights correctly, a weighted fuzzy clustering algorithm based on feature weight learning, named FWL-FWCM, is proposed. The algorithm obtains the optimal solution of the feature evaluation function by gradient descent and uses this solution as the prior feature weights. In the objective function of the proposed algorithm, the intra-cluster variance and the cluster weights are combined with the prior feature weights. The optimal solution of the objective function is obtained by alternating iteration. Because the algorithm exploits the prior feature weight information during clustering, it can overcome sensitivity to initialization and assign correct weights to features. (4) The marginal kurtosis measure (MKM) index is proposed to measure feature importance, and based on this index, a novel and robust feature 
reduction fuzzy clustering algorithm called FRFCM-MKM is proposed. Because FRFCM-MKM is insensitive to initialization and assigns weights to features correctly, it can correctly identify important and unimportant features during clustering. Therefore, when feature reduction is performed, the algorithm deletes unimportant features and retains important ones, so that FRFCM-MKM not only speeds up clustering but also improves clustering accuracy. (5) Because unimportant features in multi-view data affect clustering performance, a feature reduction multi-view fuzzy clustering algorithm named FRMV-FCM is proposed, which extends the feature reduction fuzzy clustering technique to multi-view data. Based on its objective function, FRMV-FCM learns feature weights from the data and automatically assigns weights to the features of each view, so that important features receive larger weights and unimportant features receive smaller weights. When a feature weight falls below the preset threshold, FRMV-FCM automatically deletes the corresponding feature to complete feature reduction, thereby improving clustering quality and speeding up the clustering process. |
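
The block-wise scheme described in contribution (2) can be illustrated with a minimal sketch. This is not the MMPFC algorithm itself: plain fuzzy c-means is swapped in as the per-block clustering step, and the final membership degree is assumed to serve as the weight used to pick representatives. The function names (`memberships`, `fcm`, `blockwise_fcm`), the farthest-point initialization, and the parameters `n_blocks` and `reps_per_cluster` are hypothetical choices made for illustration only.

```python
import numpy as np

def memberships(X, V, m):
    """Standard fuzzy c-means membership update for data X and centers V."""
    # Squared distances, shape (n, c); the epsilon avoids division by zero.
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm(X, c, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy c-means (a stand-in for the per-block clustering step)."""
    rng = np.random.default_rng(seed)
    # Assumption: farthest-point initialization, chosen only for stability.
    V = [X[rng.integers(len(X))]]
    for _ in range(c - 1):
        d2 = ((X[:, None, :] - np.array(V)[None]) ** 2).sum(axis=2).min(axis=1)
        V.append(X[np.argmax(d2)])
    V = np.array(V)
    for _ in range(n_iter):
        Um = memberships(X, V, m) ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]
    return V, memberships(X, V, m)

def blockwise_fcm(X, c, n_blocks=4, reps_per_cluster=5, m=2.0, seed=0):
    """Cluster X block by block, carrying representatives into the next block."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))           # random division into blocks
    carried = np.empty((0, X.shape[1]))       # representatives from the previous block
    V = None
    for idx in np.array_split(order, n_blocks):
        block = np.vstack([X[idx], carried])  # only one block is processed at a time
        V, U = fcm(block, c, m=m, seed=seed)
        # Assumption: a point's weight for a cluster is its membership degree;
        # keep the highest-weight points of every cluster as representatives.
        top = np.argsort(-U, axis=0)[:reps_per_cluster]
        carried = block[np.unique(top)]
    return V
```

In a real out-of-core setting each block would be read from disk rather than sliced from an in-memory array; the in-memory slicing here only keeps the sketch short.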