Joint Tensor Decomposition Based Big Data Efficient Mining Approaches

Posted on:2023-11-18

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Gao

Full Text:PDF

GTID:1528307043968409

Subject:Computer system architecture

Abstract/Summary:

The emergence and accumulation of massive multi-dimensional data have promoted the rapid development of artificial intelligence fields such as machine learning, pattern recognition, and computer vision. However, existing data analysis tools rely mainly on vectorized processing, which largely destroys the structure of the original data and burdens big data analysis with problems such as high memory load, complicated computation, and large-scale model redundancy. Tensors offer many advantages in efficient modeling, parsimonious representation, and association analysis of multi-dimensional data. Hence, this thesis takes the joint tensor decomposition network as its main research object and studies theories, technologies, and algorithms for big data from three aspects: multi-dimensional joint analysis, efficient computing, and model optimization. The main research contents and innovations are as follows.

Firstly, a joint tensor decomposition network is proposed to meet the requirements of joint analysis of big data, and on this basis the sparsity and security of big data are further studied. More specifically, big data is characterized by diverse forms, high-dimensional features, high coupling, and strong correlation, which makes it difficult for existing vectorized processing methods to capture the internal correlation of multi-dimensional data. A joint analysis and feature extraction method based on joint tensor decomposition for higher-order, high-dimensional data is therefore proposed. It reduces the feature dimension by a factor of hundreds while maintaining the global information of the original data, and takes full advantage of tensor networks in distributed data representation and dimension reduction. Further, considering the widespread incompleteness of multi-dimensional data, including inherent sparsity and missing data, a low-rank data imputation and completion method based on tensor representation and the joint tensor decomposition network is proposed. It realizes complementary inference and completion of incomplete observation data generated at multiple stages while maintaining the complex correlation, multi-dimensional dependency, and potential periodic characteristics of the original records, and effectively improves on the accuracy of existing completion methods. In addition, an efficient computing and joint data analysis framework based on the joint tensor decomposition network is proposed to realize joint analysis of multi-party data under a federated framework and to overcome the problems of data islands and privacy leakage that are common in big data services. The framework includes a joint higher-order orthogonal iteration algorithm (J-HOOI) based on federated learning and a federated tensor decomposition method based on J-HOOI. The parallel and incremental computing properties of the federated decomposition model are also analyzed, so as to realize safe and efficient analysis of multi-party data under the federated framework.

Secondly, to address the insufficient fitting ability of the joint tensor decomposition network, a tensor neural network is further proposed, and its structural redundancy problem is studied on this basis. More specifically, the above linear tensor network has limited fitting ability, its data analysis process lacks specific goal orientation, and it is difficult to use for large-scale data analysis. A general high-order neural network based on tensor multilinear algebra theory is therefore proposed, which effectively combines the advantages of neural networks and tensor networks, that is, the strong fitting ability of the former and the simplicity of the latter. It not only significantly improves the classification performance of tensor networks on large-scale datasets, but also compresses various models by a factor of hundreds. Optimization methods based on tensor networks are proposed to solve the structure and feature redundancy problems of neural network models in processing large-scale data. By extending traditional recurrent neural networks to high-order scenarios, high-order recurrent tensor neural networks based on multilinear algebra theory are proposed. Replacing the original input-to-hidden and hidden-to-hidden linear transformations with tensor multilinear transformations effectively improves the ability of recurrent neural networks to represent and process multi-dimensional sequences, and classification performance improves by up to 6%. For the structure and feature channel redundancy of convolutional networks, as represented by the fully convolutional network, an optimization method based on the joint tensor decomposition model is proposed, and a lightweight and efficient multi-step feature generation module is constructed to replace the traditional convolution operation. The parameters are compressed by more than 80% while preserving the model's segmentation performance.

Experimental results show that the proposed methods not only improve the performance of traditional neural networks, but also greatly reduce the training cost of various models. The joint data analysis, efficient computing, and low-rank model optimization methods based on tensor networks proposed in this thesis also advance the adoption and application of tensor multilinear algebra theory in the era of big data.
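The abstract's idea of replacing a linear transformation on vectorized data with a tensor multilinear transformation can be illustrated with a small NumPy sketch. This is not the thesis's implementation. The input and hidden shapes, the helper names mode_n_product and multilinear_transform, and the random factor matrices are all assumptions chosen for illustration. The sketch applies one small matrix along each mode of a third-order input and compares the number of weights with that of a dense matrix acting on the flattened input.

```python
import numpy as np


def mode_n_product(tensor, matrix, mode):
    """Multiply `tensor` along axis `mode` by `matrix` of shape (new_dim, old_dim)."""
    # tensordot contracts the matrix's second axis with the chosen tensor axis
    # and places the new axis first; moveaxis puts it back at position `mode`.
    contracted = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(contracted, 0, mode)


def multilinear_transform(x, factors):
    """Apply one factor matrix per mode of `x` (hypothetical helper)."""
    for mode, w in enumerate(factors):
        x = mode_n_product(x, w, mode)
    return x


rng = np.random.default_rng(0)

# Illustrative shapes only: a 32 x 32 x 3 input mapped to a 16 x 16 x 8 hidden tensor.
in_shape, out_shape = (32, 32, 3), (16, 16, 8)
x = rng.standard_normal(in_shape)
factors = [rng.standard_normal((o, i)) for i, o in zip(in_shape, out_shape)]

h = multilinear_transform(x, factors)

# Weights needed by the multilinear map versus a dense map on the flattened input.
multilinear_params = sum(w.size for w in factors)
dense_params = int(np.prod(in_shape)) * int(np.prod(out_shape))

print("hidden shape:", h.shape)
print("multilinear weights:", multilinear_params)
print("dense weights:", dense_params)
```

With these assumed shapes the three factor matrices hold 1,048 weights, against 6,291,456 for the dense matrix. The ratio depends entirely on the chosen shapes and is not a result reported in the thesis. The multilinear map is equivalent to a dense map whose matrix is the Kronecker product of the factors, which is why it preserves the tensor structure while using far fewer parameters.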

Keywords/Search Tags:

Big Data, Federated Calculation, Tensor Calculation, Model Optimization

Related items

1	The Establishment Of High Performance Computing Cluster And The Calculation On The Property Of CdGa₂S₄
2	Calculation Optimization And Experimental Study On Hydride Vapor Phase Epitaxial Growth Of Thick Gallium Nitride Films
3	Research On Time Series Data Analysis And Network Compression Based On Tensor Calculation
4	Development Of A Post-Processing System For The Calculation Of Blood Flow In Left Ventricle
5	The Design And Implementation Of Chronic Disease Management Information System In The Environment Of Big Data
6	Research For Calculation Model Based Simulator
7	Research And Application Of Sentence Similarity Calculation Based On Distributed System
8	Design And Implementation Of The Financial Statistic Reporting Platform
9	Based On The Calculation Model Of The Image Recognition Mechanism Of Classical Receptive Field
10	Design And Implementation Of Large Postal Data Analysis System