
Research On Document Association Analysis Method For Cross-border Ethnic Cultur

Posted on: 2023-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: C J Chen
Full Text: PDF
GTID: 2555306797973349
Subject: Software engineering
Abstract/Summary:
In the era of big data, the network has become the main carrier of cultural communication, and it is of great significance to use intelligent information technology to obtain and analyze big data on frontier ethnic culture in a timely manner. Cross-border ethnic document association analysis saves time and effort in helping people analyze the differences between cross-border ethnic cultures and promptly identify the associations among cross-border ethnic cultural text data. However, cross-border ethnic cultural document association analysis is a domain-specific task and the semantics of the text data are complex, so it is difficult to explore these associations with existing deep learning models alone. Based on the actual needs of cross-border ethnic culture and focusing on the characteristics of cross-border ethnic cultural text data, this thesis draws on cutting-edge techniques such as hierarchical attention networks, graph convolutional neural networks and convolutional neural networks to study three tasks: cross-border ethnic cultural text classification, text clustering and text sorting. The main work is as follows:

(1) Cross-border ethnic text classification method based on domain knowledge map

Using text classification to distinguish cross-border ethnic cultural text data is the basis of the document association analysis task. Because cross-border ethnic cultural text data lack external knowledge guidance, the ability to identify important information in the original text is insufficient, which leads to inaccurate classification of cross-border ethnic cultural categories. To address this, a cross-border ethnic text classification method based on a domain knowledge map is proposed. The cross-border ethnic cultural knowledge map is used to expand the semantics of cross-border ethnic entities in the text, and the category semantic features of the text are enhanced by the category features of entities in the knowledge map. Because a title helps pinpoint keywords and supplements and summarizes the body text, the method combines the title with the body text and merges the feature information extracted at different levels to assist classification, thus alleviating the problem of inaccurate classification of cross-border ethnic cultural categories. Experimental results show that the proposed method achieves better classification results than the baseline model.

(2) Cross-border ethnic text clustering method based on domain knowledge map

Finding the relationships between texts in large-scale cross-border ethnic cultural text data is the key task of cross-border ethnic cultural document association analysis. Most existing text clustering models rely on the semantic features of individual texts and cannot capture the relationships between texts. Based on the characteristics of cross-border ethnic cultural text data, a cross-border ethnic text clustering method based on a domain knowledge map is proposed. First, the local feature vector of each text is extracted after its entity semantic information has been expanded with the cross-border ethnic cultural knowledge map. Second, a cross-border ethnic cultural document association analysis graph containing text, topic and entity nodes is constructed, and a heterogeneous graph convolutional neural network is used to learn a global feature representation of the text data. Finally, a variational autoencoder fuses the local and global feature information of the text, and the resulting latent feature representation is used for clustering. Experimental results show that the proposed method achieves better clustering results than the baseline model.

(3) Cross-border ethnic text sorting method based on document topic feature information

Retrieval of cross-border ethnic cultural text data is an important part of the document association analysis task, and sorting is in turn an important part of text retrieval. Existing text sorting methods sort by the semantic similarity between texts and ignore the topic feature information between texts, which leads to incomplete retrieval results. To address this, a cross-border ethnic text sorting method based on document topic features is proposed. A knowledge representation model is used to vectorize the triple information in the cross-border ethnic cultural knowledge map, and these vectors are integrated into the text to supplement the semantic information of its entities. The text clustering method is used to capture the relationships between text data, and the document topic features are integrated into both the query text and the text to be retrieved. A similarity matrix between the query text and the text to be retrieved is then constructed, and the text data related to the query text are retrieved and sorted according to their text similarity scores. Experimental results show that the proposed method achieves better results than the baseline model.

(4) Design and implementation of a cross-border ethnic cultural text association analysis prototype system

Based on the above research results, a prototype system for cross-border ethnic cultural document association analysis is designed and implemented. It integrates a data processing module, a semantic expansion module for cross-border ethnic cultural entities, a cross-border ethnic cultural text classification module and a cross-border ethnic cultural text retrieval module, providing a visual information acquisition platform for relevant users.
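
The sorting idea in (3), scoring texts by both semantic and topic features rather than by semantic similarity alone, can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`cosine`, `fuse`, `rank`), the `topic_weight` parameter, the fusion by concatenation, and the toy vectors are all assumptions made for illustration, and the abstract does not specify how the features are actually combined.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse(semantic_vec, topic_vec, topic_weight):
    """Hypothetical fusion: concatenate semantic features with scaled topic features."""
    return list(semantic_vec) + [topic_weight * t for t in topic_vec]


def rank(query, candidates, topic_weight=1.0):
    """Sort candidate documents by similarity of their fused features to the query.

    `query` and each candidate are dicts with "semantic" and "topic" vectors
    (an assumed data layout). Returns (doc_id, score) pairs, best match first.
    """
    q = fuse(query["semantic"], query["topic"], topic_weight)
    scored = [
        (doc_id, cosine(q, fuse(doc["semantic"], doc["topic"], topic_weight)))
        for doc_id, doc in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: "a" matches the query semantically but has a different topic;
# "b" is slightly less similar semantically but shares the query's topic.
query = {"semantic": [1.0, 0.0], "topic": [1.0, 0.0]}
candidates = {
    "a": {"semantic": [1.0, 0.0], "topic": [0.0, 1.0]},
    "b": {"semantic": [0.9, 0.1], "topic": [1.0, 0.0]},
}

semantic_only = rank(query, candidates, topic_weight=0.0)
with_topics = rank(query, candidates, topic_weight=1.0)
```

With `topic_weight=0.0` the order depends on semantic similarity alone and document "a" comes first. With `topic_weight=1.0` document "b" moves ahead because it shares the query's topic, which is the kind of effect the abstract attributes to adding topic features.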
Keywords/Search Tags:Cross-border national culture, Cross-border ethnic cultural knowledge map, Text classification, Text clustering, Text sorting, Document association analysis