China's crop germplasm resources are diverse, numerous, and widely distributed. Managing these data in a unified way and querying them accurately remains difficult, because the information is massive, scattered, and linked by complex relationships. Introducing knowledge graph technology into the field of crop germplasm resources, so that attributes and the relationships between them are described visually as a graph, can improve the accuracy and efficiency of information queries as well as the utilization rate of crop germplasm resources. This paper takes the germplasm resources of wheat and corn, the two most common crops in Henan Province, as its research object. Using Python development tools and the graph database Neo4j, it constructs a clearly structured and strongly connected knowledge graph of wheat and corn germplasm resources in Henan Province, and builds a germplasm resource data visualization system on top of that knowledge graph. The main research contents of the paper are as follows:

(1) Data acquisition and cleaning. Data on wheat and corn varieties in Henan Province, and on the relationships between entities, were obtained from vertical websites for the crop industry, online encyclopedias, open knowledge bases, and the China Crop Germplasm Resources Information Network. To address noise, duplication, and incompleteness in the raw data, missing values were handled by deletion, regression imputation, mean imputation, and the K-nearest neighbor algorithm, and outliers were identified and removed. The Python Pandas and NumPy toolkits and the LTP toolkit were used to preprocess the wheat and corn data.

(2) Knowledge graph construction. Knowledge graph construction techniques were applied to wheat and corn crops in Henan Province. Based on the characteristics of wheat and corn domain knowledge, the knowledge base ontology was designed with a bottom-up approach. Entities, relationships, and attributes were obtained through knowledge extraction and organized into triples, and the triples were then imported into the graph database to build a clearly structured and strongly connected knowledge graph of wheat and corn germplasm resources. The graph contains 2,871 entities, 17,092 relationships, and 142 attributes. It represents the relationships between entities and attributes visually and offers advantages for querying and storing data.

(3) Construction of a germplasm resource data visualization system based on the knowledge graph. The system adopts the MVC pattern as its basic design pattern, supports visual queries written in the Cypher language, uses D3.js to control the visual presentation, and is built on a front-end and back-end separated architecture with the Spring Boot framework. On this basis, a data visualization system for wheat and corn germplasm resources in Henan Province was designed and developed to support visual operations on the knowledge and to promote the dissemination of germplasm resource knowledge.

In summary, this paper studies the construction and visualization of a knowledge graph for wheat and corn germplasm resource data and builds a germplasm resource data visualization system based on that knowledge graph. The system can manage the knowledge of wheat and corn germplasm resources in Henan Province, and it can also search and visualize that knowledge through several query methods. The research and its application can improve the sharing and utilization of crop germplasm resource data in Henan Province, which is important for making full use of germplasm resource data assets and for promoting the development of crop production in the province.
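To make the cleaning and import steps concrete, the following is a minimal Python sketch, not the implementation used in this work. It fills a missing numeric attribute with the mean of its crop group, turns the cleaned records into (head, relation, tail) triples, and renders each triple as a parameterized Cypher `MERGE` statement. The records, the field names, the `BELONGS_TO` relation, the `Entity` label, and all function names are hypothetical and chosen only for illustration. Grouping the mean by crop is likewise an assumption, and the statements are built as strings rather than executed against a Neo4j instance.

```python
from statistics import mean

# Hypothetical records; varieties, fields, and values are illustrative only.
records = [
    {"variety": "WheatA", "crop": "wheat", "plant_height_cm": 78.0},
    {"variety": "WheatB", "crop": "wheat", "plant_height_cm": None},
    {"variety": "WheatC", "crop": "wheat", "plant_height_cm": 82.0},
    {"variety": "CornA", "crop": "corn", "plant_height_cm": 250.0},
]


def fill_missing_with_mean(rows, field, group_field):
    """Return copies of rows with None in `field` replaced by the group mean."""
    observed = {}
    for row in rows:
        if row[field] is not None:
            observed.setdefault(row[group_field], []).append(row[field])
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input records stay unchanged
        group = row[group_field]
        if new_row[field] is None and group in observed:
            new_row[field] = mean(observed[group])
        filled.append(new_row)
    return filled


def to_triples(rows):
    """Build (head, relation, tail) triples linking each variety to its crop."""
    return [(row["variety"], "BELONGS_TO", row["crop"]) for row in rows]


def triple_to_cypher(head, relation, tail):
    """Render one triple as a Cypher MERGE statement plus its parameters.

    Relationship types cannot be passed as query parameters in Cypher,
    so the relation name is validated before it is placed in the text.
    """
    if not relation.isidentifier():
        raise ValueError(f"unsafe relation name: {relation!r}")
    query = (
        "MERGE (h:Entity {name: $head}) "
        "MERGE (t:Entity {name: $tail}) "
        f"MERGE (h)-[:{relation}]->(t)"
    )
    return query, {"head": head, "tail": tail}


cleaned = fill_missing_with_mean(records, "plant_height_cm", "crop")
triples = to_triples(cleaned)
statements = [triple_to_cypher(*triple) for triple in triples]
```

Using `MERGE` instead of `CREATE` keeps repeated imports from duplicating nodes or relationships, which suits data gathered from several overlapping sources. In practice, each query and parameter pair would be handed to a Neo4j client session.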