
Research And Application Of High Dimensional Data Manifold Structure

Posted on: 2022-05-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y H Chu
Full Text: PDF
GTID: 1488306338984849
Subject: Computer software and theory
Abstract/Summary:
In recent years, with the rapid development of information technology, methods for collecting information have steadily improved. People can easily obtain the data they need through computers and sensor equipment in their daily lives. The data obtained in this way are usually high-dimensional, and their volume is growing exponentially. A growing amount of fuzzy and uncertain data is both high-dimensional and small-sample. High-dimensional data appear in every corner of our lives, and how to represent and classify them effectively has become a research hotspot in machine learning. To address the problems faced in high-dimensional data classification and high-dimensional text data representation, this dissertation builds on the theory of manifold learning to study the manifold structure of high-dimensional data, and applies the results to extreme learning machines, the broad learning system, and distributed representation learning. The main contributions of this dissertation are as follows:

(1) Existing extreme learning machine methods do not preserve the local manifold structure and the global geometric structure of the data samples well. To address this, the thesis proposes a globality locality preserving extreme learning machine (GLELM). GLELM introduces the basic principles of linear discriminant analysis and locality preserving projections into the extreme learning machine, so that it maintains both the local geometric structure and the global geometric structure of the samples. In addition, existing extreme learning machine methods ignore the local discriminative information of the data samples. To address this, the thesis proposes a discriminative globality locality preserving extreme learning machine (DGLELM). DGLELM uses the idea of discriminant locality preserving projections to construct a local intra-class divergence and a local inter-class divergence, which reflect the local manifold structure and the local discriminant information of the data, and then introduces both divergences into the extreme learning machine. Experimental results on image datasets show that the two improved methods significantly improve the classification performance of the extreme learning machine.

(2) The limited number of labeled samples in hyperspectral images leads to insufficient learning in the broad learning system (BLS). The thesis approaches this problem from the perspective of manifold learning and proposes a discriminative manifold broad learning system (DMBLS). DMBLS constructs an intra-class graph and an inter-class graph. The intra-class graph captures the similarity between samples of the same class, which promotes the aggregation of samples within a class. The inter-class graph pushes samples of different classes as far apart as possible. The two graphs are then used to construct a manifold regularization framework, which is introduced into the BLS to yield DMBLS. The thesis optimizes the projection direction of the BLS output weights by minimizing the intra-class manifold structure and maximizing the inter-class manifold structure, which enhances the discriminative ability of the output weights. Experimental results show that the proposed method achieves good classification performance on hyperspectral image datasets.

(3) Existing distributed word representation models underestimate the similarity of words that are close to each other in Euclidean space and overestimate the similarity of words that are far apart. To address this, the thesis introduces manifold learning into the GloVe distributed word representation model. Manifold learning is used to "tile" (flatten) the sample distribution in the high-dimensional feature space onto a low-dimensional space while preserving the local positional relationships between the sample points in the original high-dimensional space. After tiling, distances between word vectors can be measured more reliably. In addition, Sentence BERT represents sentences mainly in a Euclidean metric space, and the geometric structure of the sentence representation and its relationship with the sentence context representation have not been carefully studied. The thesis therefore proposes a new sentence representation method, Refined Sentence BERT. The method uses manifold learning to describe the local geometric structure of the sentence space, discover the latent manifold structure among sentence vectors, and find the intrinsic relationships between them, so that the semantic information and the geometric relationships between sentences are consistent. Experimental results show that the proposed methods obtain high-quality word vectors and sentence vectors.
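
The abstract does not give the objective functions of GLELM, DGLELM, or DMBLS, so the following is only a minimal sketch of the general idea they share: adding a graph-based manifold penalty to the least-squares solution for the output weights of a random-feature network. It is not the dissertation's method. The function names (`knn_laplacian`, `fit_manifold_elm`, `predict`), the unsupervised k-nearest-neighbor graph with 0/1 weights, the tanh activation, and the parameters `C`, `lam`, and `k` are all assumptions chosen for illustration.

```python
import numpy as np

def knn_laplacian(X, k=5):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph (0/1 weights)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)  # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def fit_manifold_elm(X, T, n_hidden=50, C=1.0, lam=0.1, k=5, seed=0):
    """Random hidden layer plus closed-form output weights with a Laplacian penalty.

    Assumed objective (illustrative, not taken from the dissertation):
        ||H b - T||^2 + (1/C) ||b||^2 + lam * tr(b^T H^T L H b)
    Setting the gradient to zero gives
        (H^T H + I/C + lam * H^T L H) b = H^T T.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = np.tanh(X @ A + b)                           # hidden-layer outputs
    L = knn_laplacian(X, k)
    G = H.T @ H + np.eye(n_hidden) / C + lam * (H.T @ L @ H)
    beta = np.linalg.solve(G, H.T @ T)
    return A, b, beta

def predict(X, A, b, beta):
    """Network outputs for X; take argmax over columns for class labels."""
    return np.tanh(np.asarray(X, dtype=float) @ A + b) @ beta
```

With one-hot targets `T`, the penalty term encourages samples that are neighbors in input space to receive similar outputs. A discriminative variant in the spirit of the intra-class and inter-class graphs described above would build separate graphs from the labels instead of a single unsupervised one; the abstract does not give enough detail to reproduce that construction here.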
Keywords/Search Tags: High Dimensional Data, Manifold Learning, Extreme Learning Machine, Broad Learning System, Distributed Text Representation