| With the diversification of information technology applications, a large amount of data in various forms, such as text, image, and gene data, is produced daily. How to transform data into organized knowledge and extract valuable information has become a key research goal in the era of big data, and it has attracted considerable attention from the scientific community, the business community, and governments around the world. Unsupervised learning methods, especially clustering analysis for large-scale data, have become essential techniques for exploring unknown data. Graph learning is one of the main research directions in the field of clustering and is widely used in hierarchical clustering and spectral clustering algorithms. However, these methods still have some defects: most hierarchical clustering methods are sensitive to noise, so their clustering performance degrades significantly on noisy data; and, affected by heterogeneous information between different views and by high-dimensional complex features, multi-view clustering cannot accurately learn a unified graph. To address these limitations, this dissertation proposes several more robust and effective graph learning models to improve clustering performance. Specifically, the research work of this dissertation includes the following four aspects: (1) Most graph-based hierarchical clustering methods consider only the pairwise distance between data points in the graph construction process, which makes them sensitive to noise and outliers. To address this problem, a new adjacency graph learning method is proposed that combines the pairwise distance and the reconstruction coefficient and is robust to noise and outliers. The main characteristic of the reconstruction coefficient is the self-representation of data, i.e., each data point can be reconstructed by a linear combination of other data points. The constructed adjacency graph takes advantage of both the distance between data points and the linear representation among data
points, so it not only captures the local structure of the data well but is also robust to noise and outliers. Based on the constructed adjacency graph, a new agglomerative hierarchical clustering algorithm is developed, and the effectiveness of the proposed method is verified on several real datasets. (2) Most graph-based hierarchical clustering methods treat all features equally, making them susceptible to noisy features during graph construction. To address this problem, an adjacency graph learning method based on adaptive weighting and manifold regularization is proposed. In this method, an adaptive weighting matrix is embedded in the reconstruction error term to enhance the influence of important features in graph learning. This eliminates the influence of noise in the feature space, so that the proposed method obtains a relatively robust adjacency graph. Meanwhile, manifold regularization is introduced to capture the group effect in self-representation, so that the reconstruction coefficients of the data are smooth with respect to the intrinsic data manifold. In addition, insignificant reconstruction coefficients are truncated to further eliminate the influence of noise in the representation space, yielding a block-diagonal and more efficient adjacency graph. Based on the proposed adjacency graph, a new agglomerative hierarchical clustering algorithm is developed, and the effectiveness of the proposed method is verified on several real datasets. (3) Existing multi-view spectral clustering methods based on anchor graphs are easily affected by heterogeneous information among different views, which limits their clustering performance. To address this problem, a unified anchor graph learning method based on low-rank tensor approximation is proposed. First, the affinity matrix of the unified anchor graph is initialized by averaging the anchor graphs of multiple views, and a confidence affinity matrix is constructed to encode the affinity relationships with strong consensus
explicitly. Then, the two matrices are combined into a third-order tensor. Based on the low-rank tensor approximation, the reliable information in the strong-confidence affinity matrix is used to correct the initial unified anchor graph, so as to reduce the influence of heterogeneous information. Moreover, an efficient alternating iterative optimization algorithm is designed to solve the low-rank tensor optimization problem. Finally, the effectiveness of the proposed method is verified on several real multi-view datasets. (4) Most multi-view spectral clustering methods consider only shallow information of the data during graph construction, which makes it difficult to process data with high-dimensional and complex features effectively. To address this problem, an adjacency graph learning method based on contrastive deep matrix factorization is proposed. In this method, deep matrix factorization and reconstruction-coefficient-based graph learning are combined into a unified model so that the hierarchical feature information of the data is used for graph construction. To solve the scalability problem of matrix factorization, the model uses a variant autoencoder network to approximate the multi-view deep matrix factorization and adds activation functions so that the output of each layer is suitably constrained. First, a variant autoencoder network with a shared encoder and multiple decoders is constructed to transform multi-view fusion into the problem of finding a consistent adjacency graph. Meanwhile, the model also uses a graph contrastive learning constraint to mine the local structure in the data. Then, the model uses gradient descent to update the network parameters through backpropagation. Finally, the effectiveness of the proposed method is verified on several real multi-view datasets. In summary, starting from hierarchical clustering and spectral clustering based on graph learning, this dissertation proposes several graph learning models for some key
problems and verifies the effectiveness of the proposed models through comprehensive experiments. |
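
The idea in aspect (1), an adjacency graph that uses both pairwise distances and self-representation coefficients, can be illustrated with a minimal sketch. This is not the dissertation's model: the ridge penalty `lam`, the Gaussian kernel width `sigma`, the element-wise product used to combine the two terms, and the function names `self_representation` and `combined_affinity` are all assumptions made for illustration only.

```python
import numpy as np


def self_representation(X, lam=0.1):
    """Reconstruct each row of X from the other rows (ridge-regularized).

    Returns an n x n coefficient matrix C with a zero diagonal, where
    row i holds the coefficients that approximate X[i] from the rest.
    The ridge penalty `lam` is an assumed regularizer.
    """
    n = X.shape[0]
    G = X @ X.T  # Gram matrix of inner products between data points
    C = np.zeros((n, n))
    for i in range(n):
        idx = np.delete(np.arange(n), i)  # every point except point i
        A = G[np.ix_(idx, idx)] + lam * np.eye(n - 1)
        b = G[idx, i]
        C[i, idx] = np.linalg.solve(A, b)  # normal equations of ridge regression
    return C


def combined_affinity(X, lam=0.1, sigma=1.0):
    """Combine reconstruction coefficients with a distance-based kernel.

    The element-wise product below is a hypothetical combination rule,
    chosen only to show both sources of information entering the graph.
    """
    C = self_representation(X, lam)
    R = 0.5 * (np.abs(C) + np.abs(C).T)  # symmetric, non-negative coefficients

    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    K = np.exp(-D2 / (2.0 * sigma ** 2))  # Gaussian kernel on squared distances

    W = R * K
    np.fill_diagonal(W, 0.0)  # no self-loops in the adjacency graph
    return W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two small, well-separated groups of 2-D points (synthetic data).
    X = np.vstack([rng.normal(0.0, 0.3, (5, 2)), rng.normal(10.0, 0.3, (5, 2))])
    W = combined_affinity(X)
    print(W.shape)
```

The resulting matrix `W` is symmetric and non-negative, so it could be passed to any agglomerative or spectral clustering routine that accepts a precomputed affinity; which routine the dissertation uses is not stated in this abstract.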