Research On User Semantic Matching Within The Field Based On Rules And Contrastive Learning

Posted on: 2024-02-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Bao
Full Text: PDF
GTID: 2544307067996499
Subject: Applied Statistics
Abstract/Summary:
In recent years, with growing emphasis on general health, people have paid more attention to health management and disease prevention. Meanwhile, the rapid development of the Internet has caused an explosion of information, which makes it difficult for users to obtain correct medical and health information efficiently. How to return professional answers to users as efficiently as possible is a current research focus at the intersection of artificial intelligence and health. This thesis therefore develops a semantic matching method for the medical and dietary health field. The method can provide technical support for application scenarios such as dialog systems and intelligent customer service, which represent a large market with considerable potential.

First, text clustering is performed on a dataset of encyclopedia questions in the field of medical treatment and dietary health. Regular expressions are used to represent the question styles of each category, and patterns are then obtained based on slots and rules. Second, Siamese-network models for semantic matching are trained on the CHIP dataset and optimized with contrastive learning. Finally, the rule-based method is combined with the deep learning model to form the text-matching module, and its validity is tested.

For rule-based matching, text clustering yields 25 categories, whose question styles are summarized and converted into regular expressions; these are combined in terms of slots to obtain 13 rule-based question patterns. For similarity-model matching, baseline models including Siamese CNN, Siamese LSTM and SBERT are trained. All perform well on the test set; SBERT is the best, with an accuracy of 0.863 and an F1 value of 0.867. With data augmentation and a contrastive loss layer, the random swap method yields the largest improvement, raising accuracy and F1 value by 1.39% and 1.50% respectively. The best model reaches an accuracy of 0.875 and an F1 value of 0.880 and is used for sentence embedding and similarity calculation. Tests of the performance and effectiveness of the combined text-matching module show that it can improve the accuracy and efficiency of user semantic matching, with good portability and extensibility.
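
The three building blocks named in the abstract (slot-based regular-expression patterns, random swap augmentation, and a contrastive loss over sentence embeddings) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the abstract does not give the thesis's actual patterns, slot names, augmentation settings, or loss formulation. The example pattern and the names `match_rule`, `random_swap`, and `contrastive_loss` are hypothetical, and the loss shown is one common form (cosine distance with a margin), not necessarily the one used by the author.

```python
import math
import random
import re

# Hypothetical slots and one hypothetical question pattern; the 25 categories
# and 13 patterns of the thesis are not listed in the abstract.
SLOTS = {
    "disease": r"(?P<disease>[a-z ]+?)",
    "food": r"(?P<food>[a-z ]+?)",
}
PATTERN_CAN_EAT = re.compile(
    r"^can (?:people|patients) with " + SLOTS["disease"]
    + r" eat " + SLOTS["food"] + r"\??$"
)


def match_rule(question):
    """Return the filled slots if the question fits the pattern, else None."""
    m = PATTERN_CAN_EAT.match(question.strip().lower())
    return m.groupdict() if m else None


def random_swap(tokens, n_swaps=1, rng=None):
    """Augment a token list by swapping two randomly chosen positions n_swaps times."""
    rng = rng or random.Random()
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(u, v, label, margin=0.5):
    """One common contrastive loss on cosine distance d = 1 - cos(u, v).

    label == 1 (similar pair): penalize distance, 0.5 * d^2.
    label == 0 (dissimilar pair): penalize only pairs closer than the margin.
    The margin value is an assumption, not taken from the thesis.
    """
    d = 1.0 - cosine_similarity(u, v)
    if label == 1:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


if __name__ == "__main__":
    print(match_rule("Can patients with diabetes eat white rice?"))
    print(random_swap("can i eat rice".split(), rng=random.Random(0)))
    print(contrastive_loss([1.0, 0.0], [1.0, 0.0], label=0))
```

In a combined module of the kind the abstract describes, a question that matches a rule could be answered directly from its slots, while unmatched questions would fall back to embedding similarity; that routing order is a design assumption of this sketch.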
Keywords/Search Tags: Semantic Similarity, Rule-based Matching, Contrastive Learning, Siamese Neural Network