Efficient Algorithms For Representation Learning

Posted on: 2020-04-30 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: J F Chen
GTID: 1368330626964473 | Subject: Computer Science and Technology
Abstract/Summary:
Representation learning is one of the core research problems in artificial intelligence. Many successful models can be understood as special cases of representation learning, including latent variable models, which model the joint distribution of observable and latent variables, and deep models, which learn hierarchical representations in an end-to-end fashion. In the big-data era, by exploiting the rich information in data, representation learning achieves much better performance than hand-crafted features. However, the large amount of data poses severe challenges: noise in stochastic algorithms, instability of approximate optimization algorithms, and high time complexity all limit the efficiency of representation learning algorithms. Based on variance reduction, rejection sampling, and memory-access optimization, we study efficient algorithms for two classes of models, latent variable models and deep representation learning models, and their applications to text analysis, generative modeling, and graph node classification. The novel contributions are:

1. We propose a variance-reduced stochastic EM algorithm for latent variable models, and present theoretical results on its convergence speed and global convergence.
2. We propose a cache-efficient O(1) sampling algorithm for topic models, which is 5-15 times faster than previous algorithms and scales to hundreds of millions of documents, millions of topics, and ten thousand CPU cores.
3. We propose a partially collapsed Gibbs sampler for hierarchical topic models, scaling to datasets five orders of magnitude larger than previously reported. We also propose a coordinate-descent and rejection-sampling algorithm for supervised topic models, which is 4 times faster than the previous algorithm.
4. We propose the population matching discrepancy, a sample-based distance between two distributions. We prove consistency results for it and discuss its applications to domain adaptation and deep generative models.
5. We propose an efficient control-variate-based stochastic training algorithm for graph convolutional networks, with a proof of convergence and experimental results; it converges 7 times faster than the previous algorithm.
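To make the fourth contribution concrete: a sample-based distance of the population-matching kind can be sketched as the minimum-weight bipartite matching between two equal-sized sample sets. The sketch below is an illustrative implementation of that general idea, not the dissertation's exact definition; the normalization by sample size and the use of SciPy's Hungarian-algorithm solver are assumptions made here for clarity.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def population_matching_discrepancy(x, y):
    """Illustrative sample-based distance: average cost of the
    minimum-weight perfect matching between two sample populations."""
    # Pairwise Euclidean distances between the two sample sets.
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    # Hungarian algorithm finds the minimum-weight matching.
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum() / len(x)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(64, 2))
y = rng.normal(0.0, 1.0, size=(64, 2))  # same distribution as x
z = rng.normal(5.0, 1.0, size=(64, 2))  # shifted distribution

d_same = population_matching_discrepancy(x, y)
d_diff = population_matching_discrepancy(x, z)
```

Because the distance is computed directly from samples, it can serve as a training signal wherever two populations (e.g., generated and real data) must be brought together, which is what makes it applicable to deep generative models and domain adaptation.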
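The first and fifth contributions both rest on the control-variate principle: subtract from a noisy estimator a correlated quantity with known expectation, reducing variance without introducing bias. The following is a generic NumPy sketch of that principle on a toy integral, not the dissertation's EM or GCN algorithm; the example target E[e^X] and the estimated optimal coefficient are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x = rng.uniform(0.0, 1.0, n)

f = np.exp(x)   # noisy samples of the target: E[e^X] = e - 1
g = x           # control variate with known mean E[X] = 0.5

# Estimate the variance-minimizing coefficient c* = Cov(f, g) / Var(g).
c = np.cov(f, g)[0, 1] / np.var(g)

plain_estimate = f.mean()                    # ordinary Monte Carlo
cv_estimate = (f - c * (g - 0.5)).mean()     # control-variate estimator
```

The corrected samples `f - c * (g - 0.5)` have the same mean as `f` but much lower variance, so the estimator converges faster for the same number of samples, which is the mechanism that speeds up both the stochastic EM and the GCN training algorithms.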
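The third contribution relies on rejection sampling, which draws from an awkward target density by proposing from an easy distribution and accepting each proposal with probability proportional to the target-to-proposal density ratio. The sketch below shows the textbook form of the technique on a Beta(2,2) target with a uniform proposal; the specific target, the envelope constant, and the function names are assumptions for illustration, not the supervised-topic-model sampler itself.

```python
import numpy as np

def rejection_sample(target_pdf, proposal_sample, proposal_pdf, m, size, rng):
    """Draw `size` samples from target_pdf, accepting uniform proposals
    with probability target_pdf(x) / (m * proposal_pdf(x))."""
    out = []
    while len(out) < size:
        x = proposal_sample(rng)
        if rng.uniform() < target_pdf(x) / (m * proposal_pdf(x)):
            out.append(x)
    return np.array(out)

# Beta(2,2) has density 6x(1-x), bounded by m = 1.5 on [0, 1],
# so Uniform(0, 1) is a valid proposal distribution.
rng = np.random.default_rng(2)
samples = rejection_sample(
    target_pdf=lambda x: 6 * x * (1 - x),
    proposal_sample=lambda rng: rng.uniform(),
    proposal_pdf=lambda x: 1.0,
    m=1.5,
    size=2000,
    rng=rng,
)
```

The appeal for large-scale samplers is that each accept/reject test is O(1) per proposal, so a tight envelope constant `m` keeps the expected cost per accepted sample low.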
Keywords/Search Tags:Representation Learning, Latent Variable Models, Topic Models, Sampling Algorithms, Graph Convolutional Networks