
Research And Application Of Non-negative Matrix Factorization Algorithm Based On Privacy Protection And Interpretability

Posted on: 2023-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: S H Zhang
GTID: 2558306908464564
Subject: Computer application technology
Abstract/Summary:
Driven by the development of the Internet, massive data has emerged as an important resource and a cornerstone of social development. The challenge that comes with massive data is how to store, mine, and process it effectively. Matrices are a common form of data organization in data mining and machine learning, so reducing the dimensionality of high-dimensional matrices is of great significance in the Big Data context. As a relatively recent dimensionality reduction method, Non-negative Matrix Factorization (NMF) not only mines the latent features of high-dimensional data through low-rank approximation, but also improves interpretability: its non-negativity constraints give the factors a more practical physical meaning. NMF is now widely used in text clustering, recommendation systems, image analysis, speech processing, and other fields. After summarizing existing NMF algorithms and their shortcomings, this thesis focuses on improving the privacy protection capability, clustering performance, and computational efficiency of NMF in practical applications, and proposes corresponding improvements and solutions. The main content of the thesis, organized around these three aspects, is as follows:

(1) To strengthen the privacy protection of NMF in recommendation systems, and to address the shortcomings of existing methods (weak privacy protection, the need for pre-training, and a heavy computational load), a differentially private NMF algorithm based on a random-sampling Gaussian mechanism (RDPNMF) is proposed. By introducing a random-sampling noise-addition mechanism into the iterative process, the method controls the amount of noise to improve usability, and also avoids pre-training and reduces the computational load. A privacy proof of the RDPNMF algorithm is given. Finally, the performance of the algorithm is verified on the MovieLens datasets, and the selection of its parameters is analyzed and discussed. The experimental results and analysis show that RDPNMF offers excellent privacy protection and stronger usability.

(2) To address the clustering problem in machine learning, a symmetric non-negative matrix factorization model based on sparse graph regularization (SG-SymNMF) is proposed. The model takes both data sparsity and latent geometric structure into account. By imposing l1 regularization constraints and graph Laplacian regularization constraints, it mines information about the data from multiple angles to improve clustering performance. A solution algorithm for the model is designed, and its convergence is proved theoretically. The experimental results show that, compared with existing methods, SG-SymNMF improves on five clustering performance metrics: ACC, NMI, PUR, ARI, and F1-score. A discussion of the algorithm parameters on datasets of different scales also demonstrates the robustness of the proposed algorithm.

(3) Focusing on deep unfolding of NMF algorithms, and aiming to improve reconstruction performance, a deep unfolding non-negative matrix factorization network based on decoupled parameters (D2NMF) is proposed. The D2NMF network model builds on the deep unfolding framework for iterative algorithms, draws on the parameter-decoupling idea of LISTA-CPSS, and improves practical performance by reducing the number of network parameters. The experimental results verify the practicability of the proposed D2NMF network on simulated mutation data, where its reconstruction performance is better than that of traditional methods and existing network methods. In addition, the influence of the number of layers on the results is investigated, and the convergence of the proposed network is verified.
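
For readers unfamiliar with the baseline that the three contributions build on, the low-rank, non-negative approximation V ≈ WH described in the abstract can be illustrated with the classical multiplicative-update rules of Lee and Seung. The following is only a minimal sketch of plain NMF in standard-library Python; it is not the thesis's RDPNMF, SG-SymNMF, or D2NMF algorithm. The function names (`nmf`, `frobenius_error`), the toy matrix `V`, and the defaults for `iterations`, `seed`, and `eps` are assumptions chosen for illustration.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    B_cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of the residual V - W H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for row_v, row_wh in zip(V, WH)
               for v, wh in zip(row_v, row_wh))


def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative m x n matrix V into W (m x rank) and H (rank x n).

    Uses multiplicative updates, which keep every entry of W and H
    non-negative as long as the initial values are positive.
    """
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    # Strictly positive random initialisation (illustrative choice).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H


# Toy non-negative matrix of rank 2 (row 2 = 2 * row 1, row 4 = row 1 + 2 * row 3).
V = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1],
     [3, 2, 5]]
W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

Because every update multiplies the current factor by a ratio of non-negative quantities, W and H stay non-negative without an explicit projection step, which is the property the abstract credits for NMF's interpretability. A practical implementation would normally use a numerical library rather than nested lists.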
Keywords/Search Tags: non-negative matrix factorization, differential privacy, Gaussian mechanism, regular constraint, deep unfolding, decoupling parameter