
Data Representation Algorithms Based on Non-negative Matrix Factorization and Their Applications

Posted on: 2014-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Zheng
Full Text: PDF
GTID: 2260330425488010
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
With the development of socio-economic conditions, the means of acquiring data keep multiplying, and real applications must analyze large amounts of high-dimensional data. Such data often suffers from the "curse of dimensionality", which makes subsequent processing difficult. It is therefore necessary to find a suitable representation before working with high-dimensional data. In practice, the curse of dimensionality is addressed with low-dimensional representation techniques: the high-dimensional data is first mapped to a low-dimensional representation that reflects its semantic structure, and the analysis is then carried out on that representation.

In this thesis, high-dimensional data is given a low-dimensional representation by the non-negative matrix factorization (NMF) algorithm, which decomposes the original high-dimensional data matrix into two low-dimensional non-negative matrices whose product approximates the original matrix. Compared with other data representation approaches based on matrix decomposition, a distinguishing characteristic of NMF is that the low-dimensional factors are strictly constrained to be non-negative. NMF yields a parts-based representation, so it can extract local feature information from the original high-dimensional data.

The main contributions are as follows:
(1) Several major data representation algorithms, both linear and nonlinear, are introduced, and their advantages and disadvantages for data representation are analyzed.
(2) The non-negative matrix factorization algorithm is presented. The advantages and disadvantages of the traditional NMF algorithm are summarized, and several improved algorithms are introduced and their characteristics analyzed.
(3) A neighborhood preserving non-negative matrix factorization (NPNMF) algorithm is presented. Standard NMF does not make use of the inherent geometric information of the data; NPNMF addresses this by preserving the inherent geometry of the high-dimensional data. In addition, a semi-supervised NPNMF (SNPNMF) algorithm is proposed, which uses label information as hard constraints. SNPNMF preserves the label information and improves discriminating ability. Clustering experiments on the COIL20 and ORL datasets demonstrate that NPNMF and SNPNMF are effective and superior to the other algorithms compared.
(4) Locally Consistent Constrained-Concept Factorization (LCC-CF) is presented. Concept Factorization is an unsupervised learning algorithm that cannot take the label information and the intrinsic geometric structure into account simultaneously. LCC-CF incorporates the label information as additional hard constraints, while a graph regularizer preserves the intrinsic geometric structure of the samples. Clustering experiments on the TDT2 and Reuters-21578 datasets demonstrate that the LCC-CF algorithm is effective.
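
The basic factorization described in the abstract can be illustrated with a minimal sketch. The code below assumes the standard multiplicative update rules for the Frobenius-norm objective; the abstract does not say which update scheme the thesis uses, and the function name `nmf`, its parameters, and the synthetic test matrix are all hypothetical choices made for illustration. It covers only plain NMF, not NPNMF, SNPNMF, or LCC-CF.

```python
import numpy as np

def nmf(X, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative matrix X (features x samples) as U @ V.

    Uses multiplicative updates for the squared Frobenius error
    ||X - U V||^2 (an assumption; not necessarily the thesis's scheme).
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    # Random non-negative initialization of both factors.
    U = rng.random((n_features, rank))
    V = rng.random((rank, n_samples))
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so U and V
        # stay non-negative; eps guards against division by zero.
        V *= (U.T @ X) / (U.T @ U @ V + eps)
        U *= (X @ V.T) / (U @ V @ V.T + eps)
    return U, V

# Synthetic non-negative data with an exact rank-4 structure (illustrative only).
data_rng = np.random.default_rng(1)
X = data_rng.random((20, 4)) @ data_rng.random((4, 30))

U, V = nmf(X, rank=4)
rel_err = np.linalg.norm(X - U @ V) / np.linalg.norm(X)
print("relative reconstruction error:", rel_err)
```

Here the columns of `V` serve as the low-dimensional representation of the samples, and the columns of `U` act as non-negative basis vectors. For clustering experiments of the kind the abstract mentions, one common choice is to run a clustering algorithm such as k-means on the columns of `V`; whether the thesis does exactly this is not stated.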
Keywords/Search Tags: Non-negative matrix factorization, concept factorization, clustering, hard constraint, neighborhood preserving