
Graph Representation Learning: Spectral Theory and Self-Supervised Learning

Posted on: 2022-09-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Z Qiu
Full Text: PDF
GTID: 1528307325967539
Subject: Computer Science and Technology
Abstract/Summary:
Graph-structured data is everywhere around us: social networks, biological networks, and transportation networks. Building representation learning algorithms for graph-structured data is crucial if we want to understand, analyze, and reason about graphs. In recent years, we have witnessed the huge success of graph representation learning, including node embedding and graph neural networks. This thesis develops a spectral theory for node embedding, designs large-scale node embedding algorithms and systems, explores self-supervised learning for graph neural networks, and studies the application of graph representation learning to social influence prediction.

First, we build a spectral theory for node embedding. Random walk-based node embedding algorithms, exemplified by DeepWalk and node2vec, have been very successful in many graph learning tasks. However, the sampling mechanism behind them is not well studied, which limits the application and development of these algorithms. We analyze their behavior under both infinite and finite sampling. With an infinite number of samples, we unify these algorithms into a matrix factorization framework and reveal their relationship to graph spectral theory. With a finite number of samples, by analyzing the convergence rate of the co-occurrence matrix sampled via a Markov chain, we prove the sample efficiency and time complexity of these algorithms. Moreover, to handle the finite-sampling case, we prove a Markov chain matrix Chernoff bound, extending the classic Chernoff bound to random matrices with Markovian dependence.

Second, based on the above spectral theory, we develop cost-effective, scalable, and high-quality algorithms and systems capable of embedding graphs with hundreds of billions of edges on a single machine. With the rise of large-scale social and information networks, there is an urgent need for large-scale node embedding. We propose to conduct direct matrix factorization rather than gradient-based optimization, and we co-design the algorithm and the system by introducing techniques such as random-walk matrix-polynomial sparsification, a high-performance graph processing system, parallel hashing, and high-performance randomized singular value decomposition.

Third, we explore self-supervised learning for graph neural networks. Inspired by recent advances in pre-training and self-supervised learning from computer vision and natural language processing, we design Graph Contrastive Coding (GCC), a self-supervised graph neural network pre-training framework that captures universal network topological properties across multiple networks. Extensive experiments on three graph learning tasks and ten graph datasets suggest that our proposed framework achieves performance competitive with or better than its task-specific, trained-from-scratch counterparts.

Finally, we further study the application of graph representation learning to social influence prediction. By formalizing social influence prediction as a graph classification problem, we propose an end-to-end graph representation learning model that achieves significantly better performance than traditional methods based on hand-crafted features and domain knowledge.
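To make the "node embedding as matrix factorization" idea concrete, the following is a minimal NumPy sketch of a NetMF-style closed form for DeepWalk: build the length-T random-walk co-occurrence matrix in closed form, take an element-wise truncated logarithm, and factorize it with SVD. Function names and the dense-matrix implementation are illustrative only; the thesis targets sparse graphs at far larger scale.

```python
import numpy as np

def netmf_embedding(A, dim=2, window=3, neg=1):
    """DeepWalk-as-matrix-factorization sketch (dense, small graphs only).

    A: symmetric adjacency matrix (numpy array, no isolated nodes).
    Closed form: M = vol(G)/(neg*window) * (sum_{r=1..window} P^r) D^{-1},
    with P = D^{-1} A; the embedding is a truncated SVD of log(max(M, 1)).
    """
    deg = A.sum(axis=1)
    vol = deg.sum()                         # vol(G) = sum of degrees
    D_inv = np.diag(1.0 / deg)
    P = D_inv @ A                           # random-walk transition matrix
    S = np.zeros_like(A, dtype=float)
    Pr = np.eye(A.shape[0])
    for _ in range(window):                 # accumulate P^1 + ... + P^window
        Pr = Pr @ P
        S += Pr
    M = (vol / (neg * window)) * S @ D_inv
    logM = np.log(np.maximum(M, 1.0))       # element-wise truncated logarithm
    U, s, _ = np.linalg.svd(logM)
    return U[:, :dim] * np.sqrt(s[:dim])    # rank-dim embedding

# toy usage: a 4-node path graph
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
emb = netmf_embedding(A, dim=2)
print(emb.shape)  # (4, 2)
```

Replacing the gradient-based skip-gram training with this direct factorization is what enables the systems-level optimizations described above, such as swapping the exact SVD for a randomized SVD.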
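GCC's pre-training objective builds on contrastive learning. The following is a minimal NumPy sketch of an InfoNCE-style loss with in-batch negatives, the standard objective family such frameworks use; the function name, batch construction, and temperature value are assumptions for illustration, not the thesis's implementation.

```python
import numpy as np

def info_nce(q, k, tau=0.07):
    """InfoNCE loss on query/key embeddings with in-batch negatives.

    q, k: (n, d) arrays; k[i] is the positive key for q[i], and the
    remaining rows of k serve as negatives for q[i].
    """
    q = q / np.linalg.norm(q, axis=1, keepdims=True)   # L2-normalize
    k = k / np.linalg.norm(k, axis=1, keepdims=True)
    logits = q @ k.T / tau                             # (n, n) similarities
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                 # positives on diagonal

# toy usage: keys are lightly perturbed copies of the queries
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 16))
loss = info_nce(q, q + 0.01 * rng.normal(size=(8, 16)))
print(loss >= 0.0)  # True: cross-entropy is non-negative
```

In GCC, the query and key would be encodings of two augmented views (e.g. sampled subgraphs) of the same node's neighborhood, produced by a graph neural network encoder.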
Keywords/Search Tags: Graph Representation Learning, Graph Neural Networks, Graph Spectral Theory, Self-Supervised Learning, Social Influence