| Nonferrous metallurgy is an important pillar of China’s industrial production. Its process flows are complex and its upstream and downstream links are numerous, which makes knowledge in the industry difficult to collect. This paper applies knowledge graph construction technology, focusing on dataset construction, named entity recognition, and relation extraction. Finally, a knowledge graph visualization platform for the nonferrous metallurgy field was established, making knowledge structured and systematic across the industry, within enterprises, and among enterprises. (1) A dataset for the nonferrous metallurgy field is constructed. Web crawlers are used to collect more than 20,000 semi-structured nonferrous metallurgy texts from five data sources, and after data cleaning the dataset is annotated with the BIO scheme. To deal with the many nested entities and the loose structure typical of the field, a handling scheme was formulated, and the dataset format was defined and modified accordingly. (2) An MEB (MRC-ERNIE-BiLSTM) model is proposed, based on machine reading comprehension (MRC), the knowledge-enhanced semantic representation model ERNIE, and a bidirectional long short-term memory network (BiLSTM). With ERNIE and BiLSTM fully extracting text features, the model uses a multi-input information fusion mechanism built on the MRC framework so that it can learn the prior knowledge contained in the labels. The result is then passed to a multi-layer nested entity recognizer designed in this paper, which improves recognition accuracy for nested entities. This addresses three problems of existing named entity recognition methods in the nonferrous metallurgy field: they do not fully extract the semantic features of the text, they do not make full use of the prior knowledge in the labels, and they recognize nested named entities poorly.
(3) A BA (BiLSTM-Attention) relation extraction model based on BiLSTM and attention is proposed. At the input stage, an entity pointer output module emits all entities in the input entity set in pairs, and a convolutional neural network (CNN) and attention are used to fuse the original text with the entity features. After feature extraction by BiLSTM and attention, the output stage replaces the traditional relation generation approach with relation matching, and the handling of nested entities is optimized while identifying as many of the relations between entities as possible. This addresses three problems of existing relation extraction methods in the nonferrous metallurgy field: they do not make full use of the original text features, they fail to recognize as many relations as possible, and they are not optimized for nested entities. (4) Building on the dataset, the named entity recognition model, and the relation extraction model, this paper identifies the entities and relations in the nonferrous metallurgy texts of Gansu Province and stores them in the Neo4j graph database. After integrating third-party knowledge and structured data, it builds a knowledge graph visualization platform for the nonferrous metallurgy field in Gansu Province covering four aspects: industrial knowledge overview, enterprise knowledge, industrial knowledge, and product data. |
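
The abstract does not specify the modified dataset format or the internals of the entity pointer output step, so the following is only a minimal sketch of the two ideas under stated assumptions. It assumes nested entities are handled by splitting overlapping spans into separate BIO tag layers, and that "outputting entities in pairs" means enumerating ordered entity pairs as relation candidates. The function names (`assign_layers`, `bio_tags`, `candidate_pairs`), the entity types (`ORG`, `LOC`, `PROD`), and the example sentence are all hypothetical and are not taken from the paper.

```python
from itertools import permutations
from typing import List, Tuple

# (start, end_exclusive, entity_type); a hypothetical span format.
Span = Tuple[int, int, str]


def assign_layers(spans: List[Span]) -> List[List[Span]]:
    """Greedily place spans into layers so no two spans in a layer overlap."""
    layers: List[List[Span]] = []
    # Longer spans first, so outer entities tend to land in the first layer.
    for span in sorted(spans, key=lambda s: (-(s[1] - s[0]), s[0])):
        for layer in layers:
            if all(span[1] <= o[0] or span[0] >= o[1] for o in layer):
                layer.append(span)
                break
        else:
            # No existing layer can hold this span: open a new one.
            layers.append([span])
    return layers


def bio_tags(n_tokens: int, layer: List[Span]) -> List[str]:
    """Produce one BIO tag sequence for a layer of non-overlapping spans."""
    tags = ["O"] * n_tokens
    for start, end, etype in layer:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"
    return tags


def candidate_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Enumerate every ordered (head, tail) entity pair as a relation candidate."""
    return list(permutations(spans, 2))


# Illustrative sentence with one nested entity ("Gansu" inside the ORG span).
tokens = ["Gansu", "copper", "smelter", "produces", "cathode", "copper"]
spans = [(0, 3, "ORG"), (0, 1, "LOC"), (4, 6, "PROD")]

for depth, layer in enumerate(assign_layers(spans)):
    print(depth, list(zip(tokens, bio_tags(len(tokens), layer))))

print(len(candidate_pairs(spans)), "candidate entity pairs")
```

Layering keeps each tag sequence a valid flat BIO sequence, which is one common way to represent nested entities; whether the paper's multi-layer nested entity recognizer uses this exact representation is not stated in the abstract.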