Chinese Text Semantic Matching Based On Deep Learning

Posted on: 2023-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
Full Text: PDF
GTID: 2558306620971159
Subject: Computer application technology
Abstract/Summary:
With the rapid development of information technology, the volume of data in cyberspace is growing explosively, and a large amount of it is duplicated. Text matching is a key technology for addressing this problem. It is also widely used in application systems such as information retrieval, information-feed recommendation, and intelligent question answering. Text matching has achieved good results for English, but it still requires deeper study for Chinese, because the semantics and structure of Chinese sentences are complex and their parts of speech and emotional content are diverse. For Chinese text matching, this thesis carries out the following work based on deep learning:

(1) A multi-granularity, internal-external correlation residual model is proposed to mine the deep semantic information of Chinese text. First, because both characters and words carry semantic information in Chinese, the text is represented at both granularities to capture fine-grained sentence semantics. Then, the classical Siamese architecture is combined with residual connections so that the semantic information of each encoding layer is retained. In addition, an internal-external correlation encoding layer uses the attention mechanism to compute correlation features within each sentence (internal) and between the two sentences (external). The experimental results show that the proposed model achieves good results on Chinese text-matching tasks.

(2) Text-matching models based on pre-training and on emotional features are proposed. First, a pre-training and fine-tuning model based on RoBERTa is proposed, which performs the masked language modeling (MLM) pre-training task and the semantic matching task at the same time. A masking (occlusion) strategy is then designed for matching the semantic similarity of texts. Data augmentation is applied to the dataset, and adversarial training is added to the training process. Together, these steps effectively expand the scale of the data and make full use of the strong semantic representation ability of the pre-trained model. Furthermore, a text-matching model based on emotional features is proposed. The model extracts the semantic features and the emotional features of a sentence at the same time, then fuses them to compute the similarity of the texts. The advantages and disadvantages of the model are analyzed experimentally.
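The abstract does not say how the semantic and emotional features are fused or how similarity is scored. As a rough illustration only, the sketch below assumes the simplest option: concatenate the two feature vectors of each sentence and compare the fused vectors with cosine similarity. The function names, the `emotion_weight` parameter, and the toy vectors are all hypothetical and are not taken from the thesis.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse(semantic, emotion, emotion_weight=1.0):
    """Assumed fusion: concatenate semantic features with scaled emotional features."""
    return list(semantic) + [emotion_weight * x for x in emotion]


def fused_similarity(sem_a, emo_a, sem_b, emo_b, emotion_weight=1.0):
    """Similarity of two sentences computed on their fused feature vectors."""
    return cosine(
        fuse(sem_a, emo_a, emotion_weight),
        fuse(sem_b, emo_b, emotion_weight),
    )


# Toy vectors: identical semantic features, opposite emotional polarity.
sem_a, emo_a = [1.0, 0.0], [1.0]
sem_b, emo_b = [1.0, 0.0], [-1.0]

with_emotion = fused_similarity(sem_a, emo_a, sem_b, emo_b)
semantic_only = fused_similarity(sem_a, emo_a, sem_b, emo_b, emotion_weight=0.0)
```

With these toy inputs the two sentences look identical when only semantic features are compared, while the fused score drops once the opposing emotional features are included. That is the kind of distinction an emotion-aware matcher is meant to capture. A trained model would learn both the feature extractors and the fusion rather than use fixed vectors.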
Keywords/Search Tags:Chinese text-matching, Multi-granularity and internal-external correlation, Pre-training, Roberta, Data enhancement