Design And Implementation Of Search System Based On Natural Language Processing And Knowledge Graph

Posted on: 2023-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
Full Text: PDF
GTID: 2558307100975849
Subject: Computer technology
Abstract/Summary:
In the course of production and operation, enterprises generate large amounts of data that must be managed, indexed, and searched through an enterprise search engine. However, most current enterprise search engines retrieve information by keyword matching, which offers little interaction with users and cannot capture the intent behind a user's query sentence. To address these problems, this thesis designs and develops a search system based on natural language processing and a knowledge graph. The system can understand the semantics of the sentences a user enters, so that the search engine can perform semantic search at the knowledge level. The main research work and innovations of this thesis are as follows:

(1) Based on the problems in existing enterprise-level search systems and the goals of the system, a comprehensive requirements analysis is carried out from functional and non-functional perspectives. Functionally, the system needs to implement data processing, data indexing, knowledge graph construction, and information retrieval for enterprise-level data. Non-functionally, the system needs scalability, stability, and user-friendliness. The overall architecture of the system is designed, and each functional module is designed in detail.

(2) The construction method of the knowledge graph is studied. A knowledge graph construction scheme suitable for enterprise-level data is formulated, and triple extraction methods are studied. Triple extraction experiments were carried out with the BERT-base pre-trained model and the RoBERTa pre-trained model, and the results were compared. The comparison shows that the triple extraction model trained on RoBERTa performs better. The system therefore implements the RoBERTa-based triple extraction method as the preliminary step of knowledge graph construction. A knowledge management function is developed around the triple extraction design. It covers data labeling, model training, data review, knowledge extraction, and data storage, and thereby realizes the construction of the knowledge graph.

(3) The semantic search method based on the knowledge graph is studied. First, a method based on question template matching is studied, designed, and implemented. Then an improvement to this method is studied, and the "relationship matching" method is proposed. This thesis introduces the semantic matching technology and implementation approach used in the "relationship matching" method, compares it with "question template matching", and implements it. According to the analysis, the proposed "relationship matching" method captures the user's search intent more easily.

(4) Based on the requirements analysis and design, the system is implemented. Using the triple extraction method, the core function of building a knowledge graph for enterprise data by type is realized. Drawing on the semantic representation ability of the knowledge graph, the proposed "relationship matching" method is implemented. It enables the system to recognize, to a certain extent, the query intent of the sentence a user enters, and thereby realizes the semantic search function based on the knowledge graph.
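The general idea of answering a query by matching it against relations stored as triples can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify its semantic matching technique, so a simple bag-of-words cosine similarity stands in for it here, entity linking is reduced to substring matching, and the triples, the `answer` function, and all other names are hypothetical.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical toy triples (head, relation, tail); not data from the thesis.
TRIPLES = [
    ("Project Apollo", "project manager", "Li Wei"),
    ("Project Apollo", "start date", "2022-03-01"),
    ("Li Wei", "department", "Research and Development"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists.

    Assumption: this stands in for whatever semantic matching model
    a real system would use (for example, sentence embeddings).
    """
    count_a, count_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(count_a[t] * count_b[t] for t in count_a)
    norm_a = sqrt(sum(v * v for v in count_a.values()))
    norm_b = sqrt(sum(v * v for v in count_b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def answer(query, triples=TRIPLES):
    """Return the triple whose relation best matches the query, or None."""
    q_lower = query.lower()
    # Simplified entity linking: a head entity counts as mentioned
    # if its name occurs verbatim in the query.
    heads = {h for h, _, _ in triples if h.lower() in q_lower}
    best, best_score = None, 0.0
    for head, relation, tail in triples:
        if head not in heads:
            continue
        # Compare the query text outside the entity mention
        # with the relation name.
        rest = q_lower.replace(head.lower(), " ")
        score = cosine(tokenize(rest), tokenize(relation))
        if score > best_score:
            best, best_score = (head, relation, tail), score
    return best


if __name__ == "__main__":
    print(answer("Who is the project manager of Project Apollo?"))
    print(answer("Which department does Li Wei belong to?"))
```

Unlike a fixed question template, this kind of matching does not require the query to follow a predefined pattern: any phrasing that mentions a known entity and overlaps with a relation name can be resolved. A production system would presumably replace the token-overlap score with a learned semantic similarity model.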
Keywords/Search Tags: search engine, knowledge graph, semantic search, triple extraction