| Power text data contains a large amount of valuable information to be mined and organized. Effectively mining this text data and building intelligent applications on it is an important task in the ongoing intelligent upgrade of power grids. Traditional dictionary matching and manual extraction can no longer cope with the ever-increasing volume of text when mining entities from the various types of power text and extracting the relationships between them. To improve the efficiency of extraction from power texts and to manage the extracted information effectively, this thesis studies methods for entity recognition and relationship extraction in power texts, builds a knowledge graph from the extracted entities and relationships, and uses question classification templates to support natural-language queries. (1) Power entity recognition. After analyzing the word-formation characteristics of entities in power texts, this thesis adopts a more accurate labeling strategy to mark the boundaries of power entities precisely. Three types of power text (power science and technology texts, power operation and dispatching regulations, and power fault work orders) are selected as the original corpus for constructing an entity recognition data set. The deep learning model Bi-LSTM is combined with the statistical model CRF to build a power entity recognition model, which reduces the work of manually defining entity features. The model is trained on the self-built data set and its performance is compared with that of a single model. Experiments show that the combined model is better at recognizing entities with longer boundaries and that the comprehensive evaluation metric improves. (2) Extraction of relationships between power entities. By constructing relationship matching templates and combining them with generalization-degree and similarity evaluation indicators between entities, this thesis extracts the generic (is-a) and part-whole relationships that exist between power entities, establishing effective relational links within an otherwise scattered collection of power entities and improving the efficiency of relationship extraction. (3) Construction of the power knowledge graph. The entities and relationships extracted from unstructured text are organized and combined with relevant data from the existing power dispatching database; the topology of the knowledge graph is divided into a dispatching part and an equipment entity part; and Neo4j is used as the data store. Once the organized data has been stored, the power knowledge graph is established. (4) Classification of natural-language query questions. To support intelligent querying of the knowledge graph, a naive Bayes classifier is trained on constructed question templates and a question set. The classifier's feature vocabulary is optimized automatically during training, which safeguards classifier performance while keeping the length of the feature vocabulary under control, so that natural-language input questions are classified accurately. |
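
The question classification step in (4) can be illustrated with a minimal sketch of a multinomial naive Bayes classifier whose feature vocabulary is capped in length. This is not the thesis's implementation: the abstract does not say how the vocabulary is optimized, so keeping only the most frequent tokens is an assumption made here for illustration. The class `QuestionClassifier`, the parameter `max_vocab`, the whitespace tokenizer, the English sample questions, and the labels `belongs_to`, `has_part`, and `is_a` are likewise hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(question):
    # Assumption: whitespace tokenization; Chinese text would need a word segmenter.
    return question.lower().replace("?", "").split()


class QuestionClassifier:
    """Multinomial naive Bayes with Laplace smoothing and a capped vocabulary."""

    def __init__(self, max_vocab=50):
        # Hypothetical cap on the length of the feature vocabulary.
        self.max_vocab = max_vocab

    def fit(self, questions, labels):
        token_lists = [tokenize(q) for q in questions]
        # Assumption: keep only the max_vocab most frequent tokens as features.
        totals = Counter(t for tokens in token_lists for t in tokens)
        self.vocab = {t for t, _ in totals.most_common(self.max_vocab)}
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for tokens, label in zip(token_lists, labels):
            self.token_counts[label].update(t for t in tokens if t in self.vocab)
        return self

    def predict(self, question):
        # Tokens outside the vocabulary are ignored.
        tokens = [t for t in tokenize(question) if t in self.vocab]
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(n / n_docs)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical question set; each label stands for one query template.
TRAIN = [
    ("which substation does transformer T1 belong to", "belongs_to"),
    ("which line does breaker B2 belong to", "belongs_to"),
    ("what parts does the transformer include", "has_part"),
    ("what parts does the generator include", "has_part"),
    ("what type of equipment is the breaker", "is_a"),
    ("what type of equipment is the reactor", "is_a"),
]

clf = QuestionClassifier(max_vocab=30).fit(
    [q for q, _ in TRAIN], [label for _, label in TRAIN]
)
print(clf.predict("which substation does breaker B7 belong to?"))
```

In a pipeline of this kind, the predicted label would select a query template for the graph database, with the entity names recognized in the question filled in as template parameters.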