| With the application and development of internet technology, search engines have become an everyday tool for finding answers to unfamiliar problems. Traditional search engines match on keywords in the question text and return a set of relevant web pages as results. Taking agriculture as an example, users need to understand a range of complex professional information, such as planting methods, light, water and soil requirements, and diseases and pests, for different categories of crops. Retrieving this information with a traditional search engine requires considerable time and effort to collect and organize it from massive amounts of unstructured data, which places a heavy information-gathering burden on users. In 2012, Google proposed the prototype of the Knowledge Graph, which represents real-world entities and their relationships as triples and laid the foundation for the next generation of search engines. Because agriculture is an important part of national development, a question answering system based on a knowledge graph can effectively provide reference answers to questions that arise in the production process and save the time spent searching for information. In addition, a knowledge graph is highly scalable: more domain knowledge can be added as the domain develops, and links between new and existing entities can be extended through the triple structure to form a fuller knowledge structure. Furthermore, some questions require multiple inference steps to answer, and structuring the clues in the question's context as a graph can be effective for reasoning out the answer. Based on the knowledge graph structure, this thesis designs models for agricultural-domain question answering and for multi-hop reasoning question answering, so that they can meet specific needs in different scenarios and provide users with more convenient and efficient
information retrieval services. The main work and innovations of this thesis are summarized as follows: (1) To address the lack of structured data needed to construct a knowledge graph for an agricultural question answering system, a method is proposed that collects agricultural encyclopedia knowledge using web crawlers and two-step data preprocessing. The method collects and categorizes crops, plants, and livestock according to their specific subcategories in the agricultural field. It then associates knowledge according to the attributes of each agricultural entity category, and completes the construction and storage of structured triples for each entity and its attribute descriptions. (2) To address the problem of building a question answering system on top of an existing agricultural knowledge graph, a method is proposed that analyzes the question using a combination of named entity recognition and multi-label text classification and then queries a structured database. While the agricultural knowledge graph is being built, the method stores the entities it contains, which defines the scope of what the knowledge graph can be queried for. The method first determines whether the question text contains an entity within the query scope; if it does, it proceeds to multi-label text classification to determine which attributes of that entity the question asks about. Since a single question may ask about multiple attributes, this thesis constructs a dataset of question patterns based on the existing entity attributes, and generates single-label and multi-label question texts by filling entities into the question patterns. The generated dataset is then used to train an existing deep learning model so that it can determine the attribute categories a question asks about. Finally, by combining the results of named entity recognition and
question attribute classification, the question is parsed into a query statement for the graph database, which is executed to retrieve and return the corresponding answer. Experiments show that the question answering system can answer the various types of attribute questions covered by the collected agricultural data. (3) To address the limitation that single-hop question answering reasons over only a single paragraph or text, this thesis proposes using graph structures to model the associations between questions and contextual paragraphs, encoding the modeled text with a pre-trained language model, and then training a graph neural network to reason about the answers to multi-hop questions. This graph-structured reasoning approach is similar to the way the human brain reasons about questions. First, the method associates entities in the question text not only with the given contextual paragraphs but also with an external Wikipedia database, which provides clues to support the search for the answer. After the multi-hop reasoning graph has been constructed, the graph representation is encoded with the ELECTRA pre-trained language model, chosen for its strong performance and low computational cost, and reasoning is trained with an optimized graph attention network to achieve multi-hop question answering. |
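
The question-parsing pipeline of contribution (2) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the entity vocabulary, the attribute labels, the keyword rule that stands in for the trained multi-label classifier, and the use of a Cypher-style query with an `Entity` label and a `name` property. Substring matching also stands in for named entity recognition and would need proper tokenization in practice.

```python
from typing import Dict, List, Optional

# Hypothetical entity vocabulary, stored while the knowledge graph was built.
# It defines the scope of questions the system can answer.
KNOWN_ENTITIES = {"wheat", "rice", "tomato"}

# Hypothetical attribute labels with trigger words. This keyword rule is a
# stand-in for the trained multi-label text classifier described in the text.
ATTRIBUTE_KEYWORDS: Dict[str, List[str]] = {
    "light": ["light", "sunlight"],
    "water": ["water", "irrigation"],
    "pests": ["pest", "insect"],
}


def find_entity(question: str) -> Optional[str]:
    """Return the first known entity mentioned in the question, if any."""
    text = question.lower()
    for entity in sorted(KNOWN_ENTITIES):
        if entity in text:
            return entity
    return None


def classify_attributes(question: str) -> List[str]:
    """Return every attribute label whose trigger words appear (multi-label)."""
    text = question.lower()
    return [
        attribute
        for attribute, words in ATTRIBUTE_KEYWORDS.items()
        if any(word in text for word in words)
    ]


def build_query(question: str) -> Optional[str]:
    """Combine entity and attribute results into a Cypher-style query string."""
    entity = find_entity(question)
    if entity is None:
        return None  # question is outside the knowledge graph's scope
    attributes = classify_attributes(question)
    if not attributes:
        return None  # no recognizable attribute was asked about
    returns = ", ".join(f"e.{a} AS {a}" for a in attributes)
    # The entity comes from the closed vocabulary above, so it is safe to inline.
    return f"MATCH (e:Entity {{name: '{entity}'}}) RETURN {returns}"


if __name__ == "__main__":
    print(build_query("How much water and light does rice need?"))
```

Checking the entity first mirrors the order described in the abstract: attribute classification only runs when the question falls within the scope of the stored entities.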