| In China, soybean cultivation and growth are threatened by more than 30 diseases and approximately 100 pests, causing significant economic losses. Professional knowledge of soybean pest control exists mostly in specialist books, scientific papers, and other literature. In actual field operations, farmers cannot easily obtain the latest professional knowledge, and the efficiency of information utilization is low. How to use computer technology to help farmers obtain relevant information in real time has become an increasingly prominent need. To close this information gap, this paper proposes using natural language processing to automatically extract professional knowledge from domain literature, clean, sort, and integrate it, build a domain knowledge graph, and on that basis provide knowledge retrieval services for field operations. The main content is summarized as follows: (1) Information source screening and dataset annotation. In response to the lack of domain datasets, this paper selects the book "Primary Color Atlas of Soybean Pests and Disease" as the data source for self-built entity and relation datasets. With reference to previous research and the book's descriptions, the relations between soybean-related entities are divided into five categories: damage sites, disease symptoms, prevention and control measures, morphological features, and others. The self-built datasets are manually annotated following the annotation format of open-domain benchmark datasets. (2) Research on methods for extracting domain entity relations. This study implemented relation extraction models based on two structures, pipeline and joint learning, and conducted comparative analysis and ablation experiments on open-domain benchmark datasets and the domain datasets. Based on the pipeline structure, relation extraction models based on CNN, PCNN, and BERT were implemented. The experimental results show that the BERT model 
outperforms CNN and PCNN in the field of soybean pests and diseases, with an F1 value of 0.9849. Based on the joint learning structure, a unified model for the entity recognition and relation recognition tasks was established, using the association information between the two tasks to reduce error accumulation, and an SPNet relation extraction model was implemented. However, data sparsity in domain knowledge seriously affects the performance of the joint relation extraction model. To address data sparsity, data augmentation methods were introduced, which effectively improved the accuracy of the joint entity relation extraction model. Comparing the two learning structures and their models, pipeline models can be used to construct an initial knowledge graph, accurately identifying relations between annotated entities, while joint learning models identify entities and relations simultaneously and can be used to expand the knowledge graph, although the negative impact of data sparsity must be taken into account. (3) Research on the construction method of the domain knowledge graph. Building on the relation extraction task, the Neo4j knowledge graph building tool is used to build the domain knowledge graph, the Cypher language is used to process the data, and a web-based knowledge graph retrieval service is built. This work takes soybean diseases and insect pests as the sample domain to study the construction methods and key issues of domain knowledge graphs. With deep learning as the technical framework, and building on existing benchmark datasets and mainstream methods and models, it studies key issues in knowledge graph construction, such as entity relation extraction from domain texts and the storage, representation, and retrieval of entity relations in knowledge graphs. Dynamic and personalized knowledge retrieval for soybean pest control was realized, the connection between theoretical knowledge and field practice was 
strengthened, and precision agriculture was effectively assisted. |
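
The step from extracted relations to a Neo4j graph can be illustrated with a minimal sketch. It assumes that each extracted relation is a (head, relation, tail) triple and that the five relation categories map to relationship types; the names `Triple`, `RELATION_TYPES`, `triple_to_cypher`, and the `Entity` label are hypothetical and are not taken from the paper. The example triple is illustrative only and does not come from the annotated dataset.

```python
from typing import Dict, NamedTuple, Tuple


class Triple(NamedTuple):
    """One extracted relation: head entity, relation category, tail entity."""
    head: str
    relation: str
    tail: str


# Hypothetical mapping from the five relation categories named in the abstract
# to Neo4j relationship types (the type names are assumptions).
RELATION_TYPES: Dict[str, str] = {
    "damage site": "DAMAGE_SITE",
    "disease symptom": "DISEASE_SYMPTOM",
    "prevention and control measure": "CONTROL_MEASURE",
    "morphological feature": "MORPHOLOGICAL_FEATURE",
    "other": "OTHER",
}


def triple_to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a Cypher MERGE statement and its parameters for one triple.

    Entity names are passed as query parameters. The relationship type is
    interpolated into the query text, so it is taken only from the fixed
    whitelist above rather than from raw input.
    """
    rel_type = RELATION_TYPES.get(triple.relation)
    if rel_type is None:
        raise ValueError(f"unknown relation category: {triple.relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{rel_type}]->(t)"
    )
    params = {"head": triple.head, "tail": triple.tail}
    return query, params


if __name__ == "__main__":
    # Illustrative triple, not drawn from the paper's dataset.
    query, params = triple_to_cypher(Triple("soybean aphid", "damage site", "leaf"))
    print(query)
    print(params)
```

Using `MERGE` rather than `CREATE` keeps the graph free of duplicate nodes and edges when the same entity appears in several triples, which matters when a joint extraction model is later used to expand an initial graph. The returned query and parameters could then be passed to a Neo4j client session; that part is omitted here because the abstract does not describe it.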