Text similarity measures the degree of semantic similarity between texts; it is used to evaluate whether different texts express the same semantic information. Text similarity calculation is widely applied in intelligent customer service, search engines, and recommendation systems. Research on text similarity has a long history. The earliest methods were based on statistical information about the text, but they could not accurately capture its semantics. Later, deep learning methods were proposed for similarity calculation and achieved good performance. At present, most work that applies deep learning to text similarity targets English text; studies on Chinese text similarity remain relatively scarce.

This paper designs a BiGRU model with an attention mechanism to calculate the similarity of Chinese short texts. The main framework of the model is the Encoder-Decoder architecture. Using BiGRU as the basic model at both ends of the framework alleviates the long-sequence dependence problem and captures bidirectional semantic information well. An attention mechanism is added to the model to improve accuracy on text similarity tasks. Batch Normalization and Dropout layers are added where appropriate to speed up convergence and prevent overfitting, and the attention mechanism is further enhanced to boost the inference ability of the Decoder.

In this paper, four data sets are used and four experiments are conducted. The first three experiments are based on the ATEC competition data set. Because the positive and negative samples are unevenly distributed, the F1 score is used to compare the generalization ability of different models, and the weights of positive and negative samples are adjusted in the loss function to reduce the effect of sample imbalance as much as possible. The experimental results show that after the sample weights are adjusted, the generalization ability of the model improves greatly, and the results at the word level and character level differ little. The fourth experiment is carried out on the remaining three data sets, comparing the designed model with models commonly used in industry and listing the time required to train each. The results show that the model in this paper converges faster than the other models.
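The attention step in the Encoder-Decoder framework can be illustrated with a minimal dot-product attention sketch in plain Python. This is not the paper's implementation, only an assumed illustration of the general mechanism; the names `softmax` and `attend` are hypothetical:

```python
import math

def softmax(scores):
    """Normalize a list of scores into attention weights that sum to 1."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, encoder_states):
    """Dot-product attention: score each encoder hidden state by its
    similarity to the decoder query, then return the weighted sum
    (the context vector) together with the attention weights."""
    scores = [sum(q * h for q, h in zip(query, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return context, weights
```

At each decoding step the Decoder's query attends over all (here, hypothetical bidirectional) encoder states, so the context vector emphasizes the input positions most relevant to that step.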
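The class-weight adjustment in the loss function and the F1 evaluation described above can be sketched as follows. This is a minimal illustration under assumed conventions, not the paper's code; `pos_weight` is a hypothetical parameter scaling the positive-class term:

```python
import math

def weighted_bce(y_true, y_pred, pos_weight=1.0, eps=1e-7):
    """Binary cross-entropy where the positive-class term is scaled by
    pos_weight, so errors on the minority class cost more."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)     # clip to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def f1_score(y_true, y_pred_labels):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for y, p in zip(y_true, y_pred_labels) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred_labels) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred_labels) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)
```

With `pos_weight > 1`, a model that under-predicts the minority positive class incurs a larger loss, which is one common way to counter the sample imbalance that motivates using F1 rather than accuracy.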