Chinese Text Semantic Matching Based On Deep Learning

Posted on:2023-05-03

Degree:Master

Type:Thesis

Country:China

Candidate:L Zhang

GTID:2558306620971159

Subject:Computer application technology

Abstract/Summary:

With the rapid development of information technology, the volume of data in cyberspace is growing explosively, and large amounts of duplicated data arise as a result. Text matching is a key technology for addressing this problem. It is also widely used in application systems such as information retrieval, information-flow recommendation, and intelligent question answering. Text matching has achieved good results on English, but it still needs deeper exploration for Chinese, where the semantics and structure of sentences are complex and parts of speech and emotion are diverse. For Chinese text matching, this thesis carries out the following work based on deep learning:

(1) A multi-granularity, internal-external correlation residual model is proposed to mine the deep semantic information of Chinese. First, because both characters and words carry semantic information in Chinese, the text is represented at different granularities to obtain fine-grained semantic information about each sentence. Then, the classical Siamese architecture and residual connections are used to retain the semantic information of every coding layer. At the same time, the internal-external correlation coding layer uses the attention mechanism to compute correlation features within each sentence (internal) and between the two sentences (external). The experimental results show that the proposed model achieves good results on Chinese text-matching tasks.

(2) Text-matching models based on pre-training and on emotional features are proposed. First, a pre-training and fine-tuning model based on RoBERTa is proposed, which carries out the masked language modeling (MLM) pre-training task and the semantic matching task at the same time. A masking (occlusion) strategy is then designed for matching the semantic similarity of the texts. Data augmentation is applied to the data set, and adversarial training is added to the training process. Together these effectively expand the scale of the data and make full use of the strong semantic representation ability of the pre-trained model. Furthermore, a text-matching model based on emotional features is proposed. The model extracts the semantic features and the emotional features of a sentence at the same time, then fuses them to calculate the similarity of the texts. The advantages and disadvantages of the model are analyzed experimentally.
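
The internal-external correlation idea in item (1) can be illustrated with a small sketch. This is not the thesis model: the abstract gives no architecture details, so the code below assumes plain dot-product attention without learned weights, toy two-dimensional token vectors, mean pooling, and cosine similarity as the final score. All function names (`attend`, `encode_pair`, `match_score`) are hypothetical. It only shows how self-attention (internal), cross-attention (external), and a residual connection might be combined in a Siamese-style encoder that treats both sentences identically.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def attend(queries, keys):
    """Dot-product attention: each query row becomes a weighted sum of key rows."""
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        attended.append(
            [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(q))]
        )
    return attended


def add_rows(a, b):
    """Element-wise sum of two equally shaped lists of vectors."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def encode_pair(sent_a, sent_b):
    """Apply the same encoding to both sentences (Siamese-style).

    internal = attention of a sentence over itself,
    external = attention of a sentence over the other sentence,
    residual = the input vectors are added back to the attended features.
    """
    a_out = add_rows(sent_a, add_rows(attend(sent_a, sent_a), attend(sent_a, sent_b)))
    b_out = add_rows(sent_b, add_rows(attend(sent_b, sent_b), attend(sent_b, sent_a)))
    return a_out, b_out


def mean_pool(rows):
    """Average the token vectors into one sentence vector."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def match_score(sent_a, sent_b):
    """Cosine similarity of the pooled, attention-enriched sentence vectors."""
    a_out, b_out = encode_pair(sent_a, sent_b)
    return cosine(mean_pool(a_out), mean_pool(b_out))


# Toy token embeddings (assumed values); sentences may differ in length.
sent_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sent_b = [[0.9, 0.1], [0.1, 0.9]]
print(match_score(sent_a, sent_b))
```

In a trained model the attention inputs would pass through learned projections, and the character-level and word-level representations would each go through such an encoder before being combined; those parts are omitted here because the abstract does not specify them.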

Keywords/Search Tags:

Chinese text-matching, Multi-granularity and internal-external correlation, Pre-training, RoBERTa, Data enhancement

Related items

1	Sentiment Analysis Of Text Based On RoBERTa Model
2	Research On Text Matching Methods Based On Multiple Granularity And Siamese Interaction
3	Chinese Text Sentiment Analysis Method Based On Text Data Enhancement And ELECTRA Language Model
4	Research On Deep Learning Error Correction Method Of Chinese Text
5	Research On Multi-Document Summarization Based On Multi-Granularity Fusion And Knowledge Enhancement
6	Research On Chinese Short Text Classification Based On Pre-trained Language Model
7	Research And Application Of Text Matching Based On Deep Learning
8	Research On Multi-granular Text Matching Method Based On Deep Learning
9	Research On Answer Matching Methods Of Chinese Question Answering Systems Via Integrating Multi-source External Knowledge
10	Chinese Text Categorization Based On Correlation Rules Mining