With the rapid development of data collection technologies,high-dimensional data,such as online documents,e-books,photos,social media video series,and health data,are becoming increasingly common.Therefore,generic computational methods are needed to extract the inherent low-dimensional information.In the context of the big data era,the high-dimensional and complex hierarchical structure of image data has led to many challenges in processing images,and thus various dimensionality reduction techniques have emerged.In recent years,nonnegative matrix factorization has attracted more and more attention due to its direct interpretation of nonnegative results.Technically,nonnegative matrix factorization attempts to identify the product of two nonnegative matrices that provide a good approximation of the original matrix and can effectively express the relationship between the local and the whole.Although nonnegative matrix factorization performs well in downstream tasks such as image processing and cluster analysis,its model still has the following problems:(1)The existing high-dimensional data has complex noise.(2)The single-level structure cannot extract multi-level features.(3)The problem of how to apply a small amount of supervised information to improve the clustering accuracy needs to be addressed.In this paper,three algorithmic models are proposed to solve the corresponding three problems from the perspective of learning tasks related to nonnegative matrix factorization,respectively.The main research works and results are as follows:(1)The problem that the existing high-dimensional data are noisy and a single nonnegative matrix factorization model cannot separate a clean and effective data space,we propose a non-smooth graph regularized low-rank nonnegative matrix factorization.The algorithm first decomposes the original dataset into low-rank approximation data and noise using a nonnegative matrix factorization model,and then performs a non-smooth nonnegative matrix factorization on the low-rank approximation of the original high-dimensional data to obtain the base dataset and the corresponding low-dimensional representation.In addition,to maintain the geometric structure of the original high-dimensional data,we construct graph canonical terms for the obtained feature data to improve the robustness of the data.Finally,the low-rank decomposition model and the non-smooth nonnegative matrix factorization model with graph regular terms are combined into a balanced model framework to obtain the final algorithmic model.Compared with other nonnegative matrix factorization models,our proposed algorithm can reduce the redundant information in the feature space and demonstrate excellent clustering results on image datasets.(2)To address the problem that when the nonnegative matrix factorization model is applied to high-dimensional data for clustering,it cannot extract multi-level features in complex images due to its single-level structure,and therefore cannot mine the complex interleaved features that constitute high-dimensional data,we propose a deep non-smooth graph regularized low-rank nonnegative matrix factorization.To further obtain complex hidden information and maintain the geometric structure of high-dimensional data,we combine the deep nonnegative matrix factorization model to learn the data representation in the low-dimensional space of the original dataset.Experiments on real datasets demonstrate that our proposed model can effectively capture the feature structure of the data itself and improve the interpretability of the model.(3)To address the problem that the nonnegative matrix factorization model,as a typical unsupervised learning method,does not have any labels for the data input to the model,and its algorithm does not know the exact output until the end of the algorithm,so there is model inaccuracy and lack of guidance information,we propose a semi-supervised adaptive graph regularization low-rank nonnegative matrix three-factor factorization.The algorithm uses partially labeled data information to improve the model accuracy by the idea of uniform optimization model propagation through labeling constraints,processes the original data through nonnegative matrix triple factorization to obtain the low-rank processed feature information,and maintains the feature structure of the data through adaptive composition.In addition,the algorithm can guide the unlabeled data of the same prediction label to map into the same class,enhancing the representation discrimination of data features.Also,we design an effective iterative update optimization scheme to solve the proposed algorithm model.Extensive experimental results show that our proposed method effectively improves the clustering effect when compared with several other state-of-the-art nonnegative matrix factorization variants. |