| In recent years, the prevalence of diabetes has been rising steadily, and our country has become the country with the largest number of diabetic patients. As that number continues to grow, the disease is gradually eroding people's physical and mental health. Diabetes is a long-term chronic disease that damages large blood vessels and capillaries and endangers the nerves, eyes, and feet, causing complications such as retinopathy, diabetic foot, and diabetic nephropathy. These complications seriously threaten patients' health and reduce their quality of life. More worrying still, diabetes is a lifelong metabolic disease that can be controlled but not cured, which further increases the economic and psychological burden on patients. As the number of patients grows, more and more researchers are working on the disease, and the volume of scientific literature on diabetes increases year by year. Unfortunately, this literature is stored as text, and such unstructured data is difficult to use quickly and efficiently. To address this problem, this work proposes extracting entities and relations from the diabetes literature to construct a knowledge graph that captures the key content of the literature. On the one hand, this enriches the knowledge base for diabetes treatment and prevention; on the other hand, it allows more of the medical literature to be put to practical use. The work consists of three parts.
First, it gives a general account of the theory of knowledge graph construction. For each step of the construction process (knowledge extraction, knowledge fusion, knowledge processing, and knowledge storage), it specifies the technologies required and their purposes and functions. It explains in detail several approaches to named entity recognition and relation extraction in the knowledge extraction step, and compares candidate databases for the knowledge storage step from several perspectives, laying the foundation for the subsequent chapters. Second, the Diabetes Entity Recognition chapter introduces in detail the 16 entity types and 15 relation types in the diabetes dataset used in this work, and explains the data preprocessing stage (cleaning and labeling the unstructured text) and the model evaluation criteria. It then presents the training results of the BiLSTM+CRF model on the diabetes dataset; a comparison with several other models shows that this model performs best on the dataset. Third, in the Diabetes Relation Recognition chapter, the combined BiLSTM+Attention+CRF model achieves good results by adding an attention mechanism, constructing multiple engineered features, and training with 10-fold cross-validation. To assess the choice of model and of feature engineering, the experiments compare alternative models and feature engineering methods; the results show that the combined feature engineering and model perform best. At the end of the chapter, the Neo4j graph database is used to model the diabetes entities and relations and to build a diabetes knowledge graph. Storing and querying the graph in this database provides a data foundation for subsequent applications of the diabetes knowledge graph. |