Graph-structured data are ubiquitous in everyday life, and graph representation learning has accordingly achieved remarkable success in tasks such as social data mining, drug discovery, and knowledge reasoning. However, with the rapid growth of data, existing graph representation learning models face significant challenges in adapting to large-scale industrial scenarios: the sheer size of graph data makes it difficult to train deep, powerful graph models, while well-trained deep graph models are hard to deploy in industry because of their large parameter counts. This dissertation addresses these challenges by proposing efficient representation learning methods that require less memory and fewer model parameters for training and forward inference on large-scale graph data. This research can accelerate the industrial deployment of graph representation learning, reducing cost and improving efficiency in both model training and inference.

Graph representation learning still has two problems that urgently need to be solved. First, over-stacked graph neural networks incur high memory usage during training and high latency during inference on large-scale graphs. Second, full-batch gradient descent requires prohibitive memory as the number of nodes explodes, which hampers model training. To address these key problems, this dissertation develops efficient algorithms for different scenarios and builds a novel scalable training and deployment framework for graph representation learning. The main contributions are summarized as follows:

1. When a pre-trained deep graph model is available, we propose an adversarial knowledge distillation framework that uses trainable discriminators instead of fixed distance functions. The framework includes a representation discriminator and a logits discriminator, which exploit the inter-node and inter-class correlations captured by the teacher model. The student model is treated as a generator and trained adversarially against the discriminators: the discriminators try to distinguish student outputs from teacher outputs, while the generator tries to fool them (a minimal sketch of this training loop is given below). Experiments on benchmark datasets demonstrate that the proposed framework enables student models to achieve performance comparable to their teachers with one-fifth of the parameters, outperforming similar distillation methods.

2. When no pre-trained graph model is available, a graph model must be trained from scratch. We propose an adaptive graph partitioning method based on reinforcement learning. The method uses a differentiable combinatorial optimization solver based on the Ising model to pre-partition the graph and speed up subsequent training, and then adjusts the graph-sampling actions according to feedback from the downstream task, yielding an end-to-end, task-adaptive batch training method (a sketch of the Ising-style pre-partitioning step also follows below). Experiments demonstrate that this approach significantly reduces training cost and compute requirements on large-scale graphs while consistently outperforming other subgraph sampling algorithms on node classification.
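To make the adversarial distillation scheme concrete, the following is a minimal sketch, assuming PyTorch (the dissertation does not specify a framework here). The (representations, logits) model interface, module names, and the loss weighting `beta` are illustrative assumptions, not the actual implementation; in the proposed framework the discriminators additionally exploit inter-node and inter-class correlations, which this per-node MLP only approximates.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    """MLP scoring whether a node-level input comes from the
    teacher ("real") or the student ("fake")."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden),
                                 nn.LeakyReLU(0.2),
                                 nn.Linear(hidden, 1))

    def forward(self, z):
        return self.net(z)  # one unnormalized real/fake score per node

def kd_step(teacher, student, d_rep, d_logit, opt_stu, opt_dis,
            x, edge_index, y, beta=1.0):
    """One adversarial distillation step. Both GNNs are assumed to
    return (node_representations, class_logits)."""
    with torch.no_grad():                        # teacher stays frozen
        t_rep, t_logit = teacher(x, edge_index)
    s_rep, s_logit = student(x, edge_index)

    real = torch.ones(x.size(0), 1, device=x.device)
    fake = torch.zeros_like(real)

    # 1) Discriminator update: teacher outputs are "real", student "fake".
    d_loss = (F.binary_cross_entropy_with_logits(d_rep(t_rep), real)
              + F.binary_cross_entropy_with_logits(d_rep(s_rep.detach()), fake)
              + F.binary_cross_entropy_with_logits(d_logit(t_logit), real)
              + F.binary_cross_entropy_with_logits(d_logit(s_logit.detach()), fake))
    opt_dis.zero_grad(); d_loss.backward(); opt_dis.step()

    # 2) Student ("generator") update: fool both discriminators while
    #    still fitting the task labels.
    g_loss = (F.binary_cross_entropy_with_logits(d_rep(s_rep), real)
              + F.binary_cross_entropy_with_logits(d_logit(s_logit), real))
    loss = F.cross_entropy(s_logit, y) + beta * g_loss
    opt_stu.zero_grad(); loss.backward(); opt_stu.step()
    return d_loss.item(), loss.item()
```

Here `opt_dis` would jointly optimize both discriminators, e.g. `torch.optim.Adam(list(d_rep.parameters()) + list(d_logit.parameters()))`, while `opt_stu` optimizes only the student, so the two players are updated in alternation as in standard adversarial training.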
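The Ising-model pre-partitioning step can likewise be sketched as a differentiable relaxation. The code below (again PyTorch, with a PyG-style `edge_index` as a 2×E index tensor) relaxes binary spins to `tanh` values and minimizes a cut-plus-balance energy by gradient descent; the balance penalty `lam`, optimizer settings, and recursive use for multi-way partitioning are illustrative assumptions, and the dissertation's reinforcement-learning feedback loop is not shown.

```python
import torch

def ising_bipartition(edge_index, num_nodes, steps=500, lr=0.1, lam=0.5):
    """Differentiable 2-way partition: relax spins s_i in {-1, +1} to
    tanh(theta_i) and minimize an Ising-style energy in which cut
    edges and size imbalance are penalized."""
    theta = torch.randn(num_nodes, requires_grad=True)
    opt = torch.optim.Adam([theta], lr=lr)
    src, dst = edge_index                        # shape (2, E)
    for _ in range(steps):
        s = torch.tanh(theta)                    # soft spins in (-1, 1)
        # (1 - s_i * s_j) / 2 is ~0 for same-part endpoints and ~1 for
        # cut edges, so this approximates the cut size (up to a constant
        # factor if both edge directions are stored).
        cut = (1.0 - s[src] * s[dst]).sum() / 2
        balance = s.sum() ** 2 / num_nodes       # keeps the two parts even
        loss = cut + lam * balance
        opt.zero_grad(); loss.backward(); opt.step()
    return (torch.tanh(theta) > 0).long()        # hard 0/1 part labels

# Applying the bipartition recursively yields 2^k parts usable as
# mini-batches; an RL controller could then adjust how subgraphs are
# sampled from these parts based on downstream-task reward.
```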