Efficient Algorithms For Representation Learning

Posted on: 2020-04-30 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: J F Chen
GTID: 1368330626964473 | Subject: Computer Science and Technology
Abstract/Summary:
Representation learning is one of the core research problems in artificial intelligence. Many successful models can be understood as special cases of representation learning, including latent variable models, which model the joint distribution of observable and latent variables, and deep models, which learn hierarchical representations in an end-to-end fashion. In the big-data era, by exploiting the rich information in data, representation learning achieves much better performance than hand-crafted features. However, the large amount of data poses severe challenges: noise in stochastic algorithms, instability of approximate optimization algorithms, and high time complexity all limit the efficiency of representation learning algorithms. Based on variance reduction, rejection sampling, and memory-access optimization, we study efficient algorithms for two classes of models, latent variable models and deep representation learning models, and their applications to text analysis, generative modeling, and graph node classification. The novel contributions are:

1. We propose a variance-reduced stochastic EM algorithm for latent variable models, and present theoretical results on its convergence speed and global convergence.
2. We propose a cache-efficient O(1) sampling algorithm for topic models, which is 5-15 times faster than previous algorithms and scales to hundreds of millions of documents, millions of topics, and ten thousand CPU cores.
3. We propose a partially collapsed Gibbs sampler for hierarchical topic models, scaling to datasets five orders of magnitude larger than previously reported. We also propose a coordinate-descent and rejection-sampling algorithm for supervised topic models, which is 4 times faster than the previous algorithm.
4. We propose the population matching discrepancy, a sample-based distance between two distributions. We prove consistency results for it and discuss its applications to domain adaptation and deep generative models.
5. We propose an efficient control-variate-based stochastic training algorithm for graph convolutional networks, with a proof of convergence and experimental results; it converges 7 times faster than the previous algorithm.
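To make the fourth contribution concrete: a sample-based distance of the population-matching kind can be sketched as the minimum-weight bipartite matching between two equal-sized sample sets. The sketch below is an illustrative implementation of that general idea, not the dissertation's exact definition; the normalization by sample size and the use of SciPy's Hungarian-algorithm solver are assumptions made here for clarity.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def population_matching_discrepancy(x, y):
    """Illustrative sample-based distance: average cost of the
    minimum-weight perfect matching between two sample populations."""
    # Pairwise Euclidean distances between the two sample sets.
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    # Hungarian algorithm finds the minimum-weight matching.
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum() / len(x)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(64, 2))
y = rng.normal(0.0, 1.0, size=(64, 2))  # same distribution as x
z = rng.normal(5.0, 1.0, size=(64, 2))  # shifted distribution

d_same = population_matching_discrepancy(x, y)
d_diff = population_matching_discrepancy(x, z)
```

Because the distance is computed directly from samples, it can serve as a training signal wherever two populations (e.g., generated and real data) must be brought together, which is what makes it applicable to deep generative models and domain adaptation.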
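The first and fifth contributions both rest on the control-variate principle: subtract from a noisy estimator a correlated quantity with known expectation, reducing variance without introducing bias. The following is a generic NumPy sketch of that principle on a toy integral, not the dissertation's EM or GCN algorithm; the example target E[e^X] and the estimated optimal coefficient are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
x = rng.uniform(0.0, 1.0, n)

f = np.exp(x)   # noisy samples of the target: E[e^X] = e - 1
g = x           # control variate with known mean E[X] = 0.5

# Estimate the variance-minimizing coefficient c* = Cov(f, g) / Var(g).
c = np.cov(f, g)[0, 1] / np.var(g)

plain_estimate = f.mean()                    # ordinary Monte Carlo
cv_estimate = (f - c * (g - 0.5)).mean()     # control-variate estimator
```

The corrected samples `f - c * (g - 0.5)` have the same mean as `f` but much lower variance, so the estimator converges faster for the same number of samples, which is the mechanism that speeds up both the stochastic EM and the GCN training algorithms.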
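The third contribution relies on rejection sampling, which draws from an awkward target density by proposing from an easy distribution and accepting each proposal with probability proportional to the target-to-proposal density ratio. The sketch below shows the textbook form of the technique on a Beta(2,2) target with a uniform proposal; the specific target, the envelope constant, and the function names are assumptions for illustration, not the supervised-topic-model sampler itself.

```python
import numpy as np

def rejection_sample(target_pdf, proposal_sample, proposal_pdf, m, size, rng):
    """Draw `size` samples from target_pdf, accepting uniform proposals
    with probability target_pdf(x) / (m * proposal_pdf(x))."""
    out = []
    while len(out) < size:
        x = proposal_sample(rng)
        if rng.uniform() < target_pdf(x) / (m * proposal_pdf(x)):
            out.append(x)
    return np.array(out)

# Beta(2,2) has density 6x(1-x), bounded by m = 1.5 on [0, 1],
# so Uniform(0, 1) is a valid proposal distribution.
rng = np.random.default_rng(2)
samples = rejection_sample(
    target_pdf=lambda x: 6 * x * (1 - x),
    proposal_sample=lambda rng: rng.uniform(),
    proposal_pdf=lambda x: 1.0,
    m=1.5,
    size=2000,
    rng=rng,
)
```

The appeal for large-scale samplers is that each accept/reject test is O(1) per proposal, so a tight envelope constant `m` keeps the expected cost per accepted sample low.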
Keywords/Search Tags:Representation Learning, Latent Variable Models, Topic Models, Sampling Algorithms, Graph Convolutional Networks