
Intrinsic Low-dimensional Data Representation Based On Low-rank Tensor Decomposition And Statistical Manifold

Posted on: 2018-01-25    Degree: Doctor    Type: Dissertation
Country: China    Candidate: H Y Fan    Full Text: PDF
GTID: 1360330623950350    Subject: Information and Communication Engineering

Abstract/Summary:
In recent decades, the processing of large-scale datasets has found application in many areas, such as medical science, molecular biology, geological data processing, and remote sensing, owing to the development of data acquisition techniques. Although these large datasets provide rich knowledge, they also cause information redundancy and the curse of dimensionality. Such datasets typically have a high extrinsic dimension but often exhibit a low-dimensional intrinsic structure. It is therefore necessary to extract this parsimonious structure from the high-dimensional data with little or no loss of content information. The study of the intrinsic low-dimensional structure can not only substantially reduce dimension, but also suppress noise and interference and improve data quality. Most existing high-dimensional data have natural tensor structures, and the linear low-dimensional structure of high-dimensional data can be extracted with tensor decomposition tools. However, conventional tensor decomposition methods cannot model specific intrinsic characteristics of the data, such as sparsity and the low-rank property. In this thesis, we exploit the advantages of tensorial representations and develop tensor decomposition techniques that account for the specific structure of the data and obtain more effective representations. When the low-dimensional structure is globally nonlinear, we turn to manifold geometry instead of tensor analysis: using local statistical information and the Fisher-Riemannian metric, the high-dimensional data are mapped onto a low-dimensional statistical manifold. The contributions of this thesis are summarized as follows:

1. An efficient and scalable algorithm for tensor principal component analysis (PCA) is proposed, called the Linearized Alternating Direction Method with Vectorized technique for Tensor Principal Component Analysis (LADMVTPCA). Tensor PCA is usually viewed as a low-rank matrix completion problem solved via matrix factorization, with the nuclear norm used as a convex approximation of the rank operator under mild conditions. However, most nuclear norm minimization approaches rely on SVD operations, which incur prohibitive computational complexity on large-scale problems. Different from traditional matrix factorization methods, LADMVTPCA uses a vectorized technique to formulate the tensor as an outer product of vectors, which greatly improves computational efficiency compared with matrix factorization. In the experiments, synthetic tensor data of different orders are used to evaluate LADMVTPCA empirically; the results show that it outperforms the matrix factorization based method.

2. A Low-Rank Tensor Recovery (LRTR) method is proposed for recovering corrupted tensorial data. The denoising task is formulated as low-rank tensor recovery from Gaussian noise and sparse noise. Traditional low-rank tensor decompositions, e.g., the Tucker and PARAFAC decompositions, are generally sensitive to sparse noise. In contrast, the proposed LRTR method preserves the global structure of the data while simultaneously removing Gaussian noise and sparse noise. The method is based on a new tensor Singular Value Decomposition (t-SVD) and the tensor nuclear norm, and the NP-hard tensor recovery task is accomplished by polynomial-time algorithms. The convergence of the algorithm and the parameter settings are also described in detail. Numerical experiments demonstrate that the proposed method is effective for low-rank tensor recovery under Gaussian and sparse noise, and that LRTR outperforms other denoising algorithms on real corrupted hyperspectral data.

3. To further overcome the ill-posedness of corrupted tensorial data reconstruction, additional regularization terms beyond the sparsity and low-rank constraints are considered. Most signals have a smooth and regular structure, e.g., piecewise smoothness. Hence, a compositely regularized statistical model named LRTF-SSTV is proposed to explicitly encode the structural information of the data: its low-rank property, sparsity, and piecewise smoothness. Specifically, a Low-Rank Tensor Factorization (LRTF) model separates the low-rank clean tensorial data from the sparse noise, and total variation (TV) regularization preserves the spatial piecewise smoothness and removes Gaussian noise. To address the limitations of TV, we use SSTV regularization, which considers the local spatial structures and correlations simultaneously. Both simulated and real-data experiments demonstrate that the proposed LRTF-SSTV method achieves performance superior to state-of-the-art TV-regularized and low-rank based methods.

4. To extract the nonlinear low-dimensional structure of data, a statistical-manifold-based method is proposed. The data are mapped onto a statistical manifold using local statistical information, and a Fisher-Riemannian metric is then derived for the manifold from Riemannian geometry. Generally, the statistical manifold is controlled by a few implicit variables, far fewer than the ambient dimension of the data. Moreover, a manifold defined on local statistical information is robust to noise, missing data, and occlusion. Simulation and real-data experiments demonstrate the effectiveness of the proposed method.
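The t-SVD-based recovery described in contribution 2 rests on singular value thresholding applied in the Fourier domain, the proximal operator of the tensor nuclear norm. The following is a minimal sketch of that operator for a real third-order tensor; the function name `tsvt` and the threshold parameter `tau` are illustrative choices, not taken from the dissertation's implementation.

```python
import numpy as np

def tsvt(X, tau):
    """Tensor singular value thresholding via the t-SVD (illustrative sketch).

    Transforms the tensor along its third mode, soft-thresholds the
    singular values of every frontal slice, and transforms back.
    """
    n1, n2, n3 = X.shape
    Xf = np.fft.fft(X, axis=2)              # FFT along the third mode
    Yf = np.zeros_like(Xf)
    for k in range(n3):                     # SVD of each frontal slice
        U, s, Vt = np.linalg.svd(Xf[:, :, k], full_matrices=False)
        s = np.maximum(s - tau, 0.0)        # soft-threshold singular values
        Yf[:, :, k] = (U * s) @ Vt
    # For a real input tensor the inverse FFT is real up to rounding error.
    return np.real(np.fft.ifft(Yf, axis=2))
```

With `tau = 0` the operator reproduces its input, while a positive `tau` shrinks the tubal spectrum toward a low tubal-rank tensor, which is the mechanism that lets polynomial-time algorithms approach the otherwise NP-hard recovery problem.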
Keywords/Search Tags: intrinsic low-dimensional structure, low-rank tensor decomposition, total variation, Riemannian manifold, local statistical information