| For a major coal-producing country, coal mine accidents are the biggest threat to safe coal mine production. With the development of "Internet + Industry", more and more scholars have applied computer technology to research on coal mine accidents. By summarizing the large body of existing coal mine accident case texts and applying knowledge graph theory and technology, a knowledge graph of coal mine safety production can be constructed to mine the relationships between accident entities and monitor the production process, helping to prevent coal mine accidents. Named entity recognition is the first step in constructing a knowledge graph and lays the foundation for a knowledge graph of coal mine safety production. The main research contents of this thesis are as follows: (1) An entity-labeled corpus for the coal mine accident domain is constructed. Using coal mine accident cases from the past two decades as the data set, and classifying entities with input from domain experts and actual application requirements, this thesis builds the Coal Mine Corpus, an entity-labeled corpus containing four entity types: time, place name, mine name, and accident type. (2) To address the insufficient semantic expressiveness of traditional word vectors, the poor parallel computing ability of RNNs, and the weakness of CNNs in extracting long-distance features, a named entity recognition algorithm for coal mine accident cases is proposed. The algorithm consists of a word embedding layer, a CNN layer, and a CRF layer. The embedding layer uses the ALBERT pre-trained language model, which adds position vectors and text vectors to the original word vectors and is trained in a self-supervised manner to obtain vector representations. The CNN layer uses a CNN as the basic network, in which four iterated dilated convolution blocks of three layers each perform feature extraction; each block consists of dilated convolutions with dilation rates of 1, 1, and 2. Finally, a CRF constrains the predictions of the CNN layer. Experiments show that this model outperforms the three other neural network models it was compared with in both accuracy and time efficiency. (3) The feedforward neural network layer in the traditional Transformer holds a large share of the parameters, which leads to long training times. To address this, this thesis proposes the FS-Transformer-CRF model, built on the word embedding layer of the ALBERT-IDCDA-CRF entity recognition algorithm. By using N pairs of key-value memory vectors, the attention layer and the feedforward neural network layer are replaced with a single attention layer, yielding the structurally simpler FS-Transformer. Comparative experiments verify the feasibility and effectiveness of the FS-Transformer-CRF model. (4) Based on the FS-Transformer-CRF named entity recognition model, a named entity recognition system for coal mine accident cases is designed. The system recognizes named entities automatically and also allows incorrect results to be corrected manually. |
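
The iterated dilated (expansion) convolution structure in point (2) can be illustrated with a small sketch showing how four blocks with dilation rates 1, 1, and 2 widen the context seen by each token. This is not the thesis implementation: the kernel width of 3, the all-ones weights, the absence of activations, and the names `receptive_field`, `dilated_conv1d`, and `run_blocks` are all assumptions made for illustration.

```python
def receptive_field(dilations=(1, 1, 2), kernel_size=3, blocks=4):
    """Number of input positions that can influence one output position.

    kernel_size=3 is an assumption; the abstract does not state it.
    """
    per_block = sum((kernel_size - 1) * d for d in dilations)
    return 1 + blocks * per_block


def dilated_conv1d(x, w, dilation):
    """1-D dilated convolution with zero padding, so output length == input length."""
    k = len(w)
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j in range(k):
            # Taps are spaced `dilation` positions apart around the centre i.
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < len(x):
                acc += w[j] * x[idx]
        out.append(acc)
    return out


def run_blocks(x, dilations=(1, 1, 2), blocks=4, w=(1.0, 1.0, 1.0)):
    """Apply the same three-layer dilated block `blocks` times (no activations)."""
    for _ in range(blocks):
        for d in dilations:
            x = dilated_conv1d(x, w, d)
    return x


# Push a single impulse through the stack and count how far it spreads.
impulse = [0.0] * 41
impulse[20] = 1.0
one_block_span = sum(1 for v in run_blocks(impulse, blocks=1) if v != 0.0)
four_block_span = sum(1 for v in run_blocks(impulse, blocks=4) if v != 0.0)
print(one_block_span, four_block_span, receptive_field())
```

Under these assumptions, one block covers 9 positions and four blocks cover 33, using only twelve convolution layers that can all be computed in parallel across the sequence, which is the motivation the abstract gives for preferring this structure over an RNN.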