| With the rapid development of the big data era, information resources on the Internet are growing explosively, and the forms of information are becoming increasingly complex and diverse. How to obtain useful information quickly and accurately from cluttered data is an urgent problem. A knowledge graph presents large amounts of information to users in a structured way, as entity-relation triples, so that users can find the information they need precisely and quickly, which greatly improves the accuracy and efficiency of information acquisition. At present, knowledge graph research in China is at an early stage. Existing research focuses mainly on the construction and application of general knowledge graphs, while research on domain knowledge graphs is still immature and concentrates mostly on medicine and social management. This may make information acquisition and knowledge learning in the field of education more difficult, and it also restricts, to some extent, intelligent development in that field. Basic education is the core of China’s national education system and an important cornerstone of the country’s future development, so its importance cannot be ignored. This paper therefore focuses on the field of basic education. Building on an in-depth analysis of existing research, we construct a high-quality corpus for basic education and optimize the named entity recognition and relation extraction algorithms to improve the accuracy and comprehensiveness of the knowledge graph, and thereby construct a basic education knowledge graph. The specific work is as follows: (1) Dataset construction: Textbooks and teachers’ reference books for eight basic subjects (excluding English) in basic education are selected as the main data source for the knowledge graph, supplemented by 50 documents on basic education downloaded from CNKI, to form the basic education dataset. The data are pre-processed for the subsequent named entity recognition and relation extraction tasks; the named entity recognition dataset is annotated with the "BMES" labeling scheme, a basic education word set is constructed, and relations are labeled according to the characteristics of each discipline to form the relation extraction dataset. (2) Named entity recognition: To make better use of lexical boundary information, a BERT-IDCNN-CRF model combined with the Soft Lexicon lexicon enhancement method is proposed. The BERT pre-trained language model is used to obtain character embedding vectors, while the Soft Lexicon method introduces lexicon information from the basic education field; the BERT character vectors are combined with the lexicon features to form the input vectors, which improves input quality. For feature extraction, the IDCNN-CRF model has a computational advantage over the BiLSTM-CRF model: it can make full use of GPU parallelism, improving computational efficiency while maintaining performance. (3) Relation extraction: To make better use of linguistic features (LF), an LF-ERNIE-Att-BiLSTM Chinese relation extraction model is proposed that targets the linguistic characteristics of Chinese. First, a vectorized representation of the input sequence is obtained from the ERNIE pre-trained language model. In parallel, the relative position information, entity information and keyword information in the input sequence are extracted and concatenated into a linguistic feature vector. The text vector is then concatenated with the linguistic feature vector as input, giving a more accurate semantic vector representation of basic education text. Next, features are extracted by a BiLSTM network, and an attention mechanism focuses on the features that matter most for the relation extraction task. Finally, the relation classification result is obtained through a Softmax layer. (4) Knowledge graph construction: The named entity recognition and relation extraction algorithms proposed in this paper are applied to the constructed dataset to extract entities and the relations between them, yielding entity-relation triples. The Neo4j graph database is used to construct the basic education knowledge graph; the effectiveness of the algorithms is verified, and visualization and query functions for the basic education knowledge graph are implemented. |
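
The "BMES" labeling scheme named in step (1) can be illustrated with a minimal Python sketch. It assumes the common convention in which B, M and E mark the beginning, middle and end characters of a multi-character entity, S marks a single-character entity, and O marks characters outside any entity. The function name `bmes_tags`, the span format, the entity type `CONCEPT` and the example sentence are hypothetical and are not taken from the paper's dataset.

```python
from typing import List, Tuple


def bmes_tags(text: str, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Return one BMES tag per character of `text`.

    Each span is (start, end_exclusive, entity_type). Characters not
    covered by any span receive the tag "O" (an assumed convention).
    """
    tags = ["O"] * len(text)
    for start, end, etype in spans:
        if not 0 <= start < end <= len(text):
            raise ValueError(f"invalid span: ({start}, {end})")
        if end - start == 1:
            # Single-character entity
            tags[start] = f"S-{etype}"
        else:
            # Multi-character entity: begin, middle characters, end
            tags[start] = f"B-{etype}"
            for i in range(start + 1, end - 1):
                tags[i] = f"M-{etype}"
            tags[end - 1] = f"E-{etype}"
    return tags


# Invented example sentence with two hypothetical CONCEPT entities.
sentence = "勾股定理描述直角三角形"
tags = bmes_tags(sentence, [(0, 4, "CONCEPT"), (6, 11, "CONCEPT")])
for ch, tag in zip(sentence, tags):
    print(ch, tag)
```

Tagging at the character level, rather than the word level, avoids depending on a Chinese word segmenter, which is one reason character-based schemes such as this are widely used for Chinese named entity recognition.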