| In recent years, online social networks have become increasingly popular all over the world. They have become an indispensable tool for people’s daily communication, information acquisition, and discussion of trending events. People join multiple social network platforms, such as Weibo and Twitter, and use their various services simultaneously. Users who register on multiple networks act as bridges between those networks. User alignment across social networks aims to identify the same natural person among the many virtual accounts on different social networks. Because of its practical value in applications such as cross-network business recommendation, link prediction, and cyberspace security, user alignment has attracted widespread attention from both academia and industry. Current research on user alignment faces four main problems. First, user attributes are diverse, and there is no general method for processing attribute texts of multiple types. Second, the high-level semantic information hidden in user attribute texts is difficult to capture. Third, traditional methods can hardly handle the imbalance between user attribute features and network structure features. Finally, the difficulty of obtaining labeled data hinders improvements in model performance. To address these challenges, this thesis summarizes and elaborates a unified framework for the user alignment problem and focuses on user alignment algorithms across social networks. The research covers the following three aspects: (1) To address the diversity of user attributes and the difficulty of capturing high-level semantics in social networks, this work proposes a novel semi-supervised user alignment model (MARUA) based on multi-level attribute embedding and Regularized Canonical Correlation Analysis. The method captures features of multiple types of attribute text, together with high-level semantic features, through multi-level attribute embedding, and develops a linear projection based on Regularized Canonical Correlation Analysis that maps the user features of different social networks into a unified latent vector space. This mapping also minimizes the distance between the same users on different networks. Compared with traditional supervised learning methods, MARUA greatly reduces the number of annotations required for model optimization and lowers the cost of data collection and model training. (2) To address the imbalance between attribute features and structural features, and the difficulty of obtaining labeled data, this research proposes a user alignment model (JARUA) based on the joint embedding of multi-granular user attributes and relationships. The model extracts multiple types of user attribute features by automatically identifying the granularity level of each attribute text and applying representation learning methods, and then uses Graph Attention Networks to learn alignment-oriented network structural features. Finally, JARUA uses an iterative training algorithm that makes full use of unlabeled samples and obtains high-quality automatically labeled data through a filtering mechanism, thereby achieving better model performance. (3) This thesis constructs two real-world social network data sets (Weibo-Douban and DBLP17-DBLP19). The data sets contain multiple types of user attribute information and have the structural characteristics of real social networks, so they support the testing and evaluation of various kinds of user alignment methods. We verify and evaluate the two proposed models on these data sets, and the experimental results demonstrate the effectiveness and superiority of our models. |
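
The projection step described in aspect (1) can be illustrated with a minimal sketch of regularized CCA. This is not the thesis's MARUA implementation: the function names `fit_rcca` and `align_users`, the ridge parameter `reg`, and the nearest-neighbour matching rule are all assumptions made for illustration, and the inputs are assumed to be precomputed user feature matrices whose rows are known aligned pairs.

```python
import numpy as np

def fit_rcca(X, Y, k, reg=1e-3):
    """Fit regularized CCA on aligned rows of X (n x dx) and Y (n x dy).

    Hypothetical helper: returns projection matrices Wx, Wy, the feature
    means mx, my, and the top-k canonical correlations.
    """
    n = X.shape[0]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    # Covariances with a ridge term on the diagonal (the "regularized" part).
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Whiten both views with Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # T = Lx^{-1} Cxy Ly^{-T}; its singular vectors give the canonical directions.
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U[:, :k])     # Lx^{-T} U_k
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :k])  # Ly^{-T} V_k
    return Wx, Wy, mx, my, s[:k]

def align_users(Xs, Yt, Wx, Wy, mx, my):
    """For each source user, return the index of the nearest target user
    in the shared latent space (squared Euclidean distance)."""
    Px = (Xs - mx) @ Wx
    Py = (Yt - my) @ Wy
    dist = ((Px[:, None, :] - Py[None, :, :]) ** 2).sum(axis=-1)
    return dist.argmin(axis=1)
```

Because each projected view is constrained to have roughly unit variance, maximizing the correlation between paired projections is equivalent to minimizing their squared distance, which is one way to read the abstract's statement that the mapping minimizes the distance between the same users. In this sketch only the labeled pairs are used to fit the projection; unlabeled users are simply projected and matched afterwards.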