| In recent years, knowledge question answering has become an active research direction in natural language processing. Traditional question answering systems use web page text directly as the search database and rely on shallow matching to obtain answers, which makes it difficult to meet users' fine-grained needs. Data sparsity and shallow semantic matching often lead to low recall and low accuracy in knowledge question answering. To address these problems, this thesis starts from data augmentation of the knowledge graph and the question and answer corpus, and carries out in-depth research on the technologies underlying a crop planting knowledge question answering system. The main contents are as follows: (1) Data augmentation of the crop planting knowledge graph and question database. First, entities and relations are extracted from data with different structures and fused with the knowledge in the existing knowledge graph; the resulting triples for the crop planting domain are then stored in a graph database. Second, the GPT-2 (Generative Pre-trained Transformer 2) model is used to augment the question and answer corpus, providing more data to support the question answering system. (2) A text matching algorithm based on BERT+DSSM (Bidirectional Encoder Representations from Transformers + Deep Structured Semantic Model). The vector representation of a sentence is affected not only by the semantics of the sentence itself; context information and the positions of words in the sentence also affect how accurately the vector represents the whole sentence. Therefore, building on the DSSM algorithm, this thesis proposes a BERT+DSSM algorithm that replaces the bag-of-words model with the BERT model, adding contextual and positional information to the sentence representation so that the final vector expresses the meaning of the sentence more comprehensively. Similarity is then calculated on these representations, which further improves the performance of the algorithm. (3) Design and implementation of a crop planting question answering system based on data augmentation. The augmented crop planting knowledge graph and question and answer corpus serve as the basic knowledge base. BERT+DSSM is used to match the user’s question, and the matched answer is returned to the user. In addition, the Chinese DBpedia knowledge base is used as a supplementary knowledge base to improve the recall of the question answering system. |
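
The matching step described in (2) and (3) can be illustrated with a minimal sketch. It is not the thesis implementation: the names `rank_candidates`, `toy_encode`, and the smoothing factor `gamma` are hypothetical, and the toy hashed bag-of-words encoder only stands in for a sentence encoder so that the example runs without a trained model. In a BERT+DSSM setup, the `encode` argument would presumably wrap a BERT model that returns one vector per sentence. The scoring follows the usual DSSM recipe of cosine similarity followed by a softmax over the candidates.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(
    question: str,
    candidates: List[str],
    encode: Callable[[str], Vector],
    gamma: float = 10.0,  # assumed smoothing factor, not a value from the thesis
) -> List[Tuple[str, float]]:
    """Rank candidate questions by a softmax over gamma-scaled cosine similarities."""
    if not candidates:
        return []
    q_vec = encode(question)
    scaled = [gamma * cosine(q_vec, encode(c)) for c in candidates]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(candidates, probs), key=lambda pair: pair[1], reverse=True)


def toy_encode(text: str, dim: int = 64) -> List[float]:
    """Hypothetical stand-in encoder: a hashed bag of words, NOT a BERT embedding."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[sum(ord(ch) for ch in token) % dim] += 1.0
    return vec


if __name__ == "__main__":
    stored_questions = [
        "when to plant winter wheat",
        "how to irrigate rice fields",
        "best fertilizer for corn",
    ]
    for text, prob in rank_candidates("when to plant winter wheat", stored_questions, toy_encode):
        print(f"{prob:.3f}  {text}")
```

Passing the encoder as a parameter keeps the ranking logic independent of how the sentence vectors are produced, so replacing the bag-of-words stand-in with a contextual encoder would change only the `encode` argument.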