
Cluster Analysis Based On Discrete Hash

Posted on: 2024-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: S T Xuan
GTID: 2568307055470764
Subject: Electronic information
Abstract/Summary:
With the rapid development of information technology, the world has entered the era of multimedia big data, supported by the Internet. The enormous social and economic value of the vast amount of available multimedia information brings both new opportunities and new challenges to society. Given such massive multimedia data, how to process it efficiently and perform effective image clustering is a hot issue in computer science. Among k-means based clustering algorithms, hash representation learning has received considerable attention because of its high efficiency and low storage cost. However, for widely used image and text data, high feature dimensionality and large data scale mean that hash-based learning still suffers from low clustering efficiency and long clustering time. In addition, for multimedia data drawn from many sources, how to exploit the useful information shared between views and design efficient clustering algorithms is an urgent open problem. Based on the above analysis, this thesis proposes three clustering algorithms that improve clustering efficiency, addressing in turn the high dimensionality, large scale and multiple views of multimedia data.

(1) A binary hashing method based on automatic feature selection is proposed for image clustering to address the problem of high-dimensional data. First, the adaptive feature selection property of the ℓ2,1-norm is applied to the input data, and the most useful features in the original data are selected over multiple iterations to complete dimensionality reduction. A hash function is then used to project the high-dimensional data into a low-dimensional space. Finally, low-rank matrix decomposition and spectral embedding are applied to the reduced data in the low-dimensional, sparse Hamming space, so that clustering is completed in the binary Hamming space. Experimental results verify that the proposed method achieves good clustering performance and high efficiency.

(2) An unsupervised hashing method with an adaptive loss function for feature selection is proposed for image clustering to address the problem of large-scale data. First, an unsupervised hash learning model and binary clustering learning are combined into a joint optimisation objective. Then, an adaptive loss function that lies between the ℓ1-norm and the ℓ2-norm, and combines the advantages of both, is used to enhance robustness to outliers. Finally, low-rank matrix decomposition and spectral embedding are applied to the binary data, and fast clustering is performed in Hamming space. Experiments verify the superiority of the method in clustering performance.

(3) A redundancy-reducing multi-kernel clustering method with generalisation error bounds is proposed for the problem of multi-view data. It integrates the generation of clustering labels and the learning of a consensus partition matrix into a unified framework, while introducing a non-redundancy regularisation term to reduce redundancy between views. The generalisation error bound of the algorithm is then analysed theoretically. Experimental results verify both the clustering performance and the efficiency of the proposed method.
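The building blocks named in the abstract (row-wise feature scoring under an l2,1-type norm, sign-based hashing, and comparison in Hamming space) can be illustrated with a minimal NumPy sketch. This is not the thesis's optimisation procedure. The function names, the weight matrix `W`, and the projection matrix `P` are hypothetical, and a fixed linear projection stands in for the learned hash function.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def select_features(W, k):
    """Indices (ascending) of the k rows of W with the largest l2 norm.

    Assumption: each row of W holds the weights of one input feature,
    so a near-zero row marks a feature that can be dropped.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return np.sort(np.argsort(row_norms)[-k:])

def binary_codes(X, P):
    """Threshold a linear projection at zero to obtain 0/1 hash codes."""
    return (X @ P > 0).astype(np.uint8)

def hamming_distances(B, C):
    """Pairwise Hamming distances between the code rows of B and C."""
    return (B[:, None, :] != C[None, :, :]).sum(axis=2)

# Toy data: 3 features, of which the second carries no weight.
W = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]])
kept = select_features(W, k=2)           # keeps features 0 and 2

X = np.array([[1.0, 5.0, -2.0],
              [-1.0, 5.0, 2.0]])
P = np.eye(2)                            # hypothetical projection matrix
codes = binary_codes(X[:, kept], P)      # one 2-bit code per sample
dist = hamming_distances(codes, codes)   # 2 x 2 distance matrix
```

Comparing binary codes needs only element-wise inequality and a sum, which is the usual motivation for clustering in Hamming space instead of the original real-valued feature space.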
Keywords/Search Tags:Hashing, Image clustering, Multi-view, Projection, Error learning