| With the rapid development of the chemical industry, chemical products have become part of daily life and play an indispensable role in it. In recent years, hazardous chemical accidents have occurred frequently because of operational violations by staff or negligence by management personnel. To reduce the probability of such accidents and improve the efficiency of post-accident disposal, the field commonly builds text databases of hazardous chemical accident cases. Such a case library enables case information sharing: it helps practitioners draw experience and lessons from historical cases, adopt appropriate emergency strategies to limit harm when an accident occurs, and strengthen safety management awareness to lower the likelihood of accidents beforehand. However, existing domestic hazardous chemical accident case databases do little more than store text, and they suffer from inconsistent text formats, subjective classification standards, simple retrieval methods, and time-consuming, labor-intensive manual operation. To address these problems, this thesis constructs a hazardous chemical accident case text data set by collating hazardous chemical accident text information, and proposes text classification algorithms suited to the characteristics of hazardous chemical accident text data, thereby enabling automatic classification of hazardous chemical accident texts. The main contributions of this thesis are as follows: (1) To address the lack of text data sets of hazardous chemical accident cases, this thesis constructs a data set of 11,674 hazardous chemical accident case texts through the collection, noise processing, and labeling of website data. According to the stage at which the accident occurred, the accident texts are divided into five categories:
production accidents, storage accidents, use accidents, transportation accidents, and other accidents, which provides the data foundation for the subsequent research on text classification algorithms. (2) To classify the category of hazardous chemical accident case texts automatically, this thesis proposes a text classification model based on BERT-BiLSTM-Attention. The BERT model produces the semantic representation of the text, a BiLSTM captures contextual features in both directions, and an Attention mechanism weights the contribution of different parts of the text to the classification. Experiments compare the model with Word2Vec-BiLSTM, BERT-CNN, and BERT-BiLSTM, and the results show that the proposed method improves the F1 value by 4.59%, 3.35%, and 1.12%, respectively. (3) To address the low accuracy of the BERT-BiLSTM-Attention model on some category labels, this thesis adopts the RoBERTa pre-trained model, which uses a dynamic masking mechanism, to obtain text vector representations, integrates a BiGRU model to capture bidirectional text information, and combines it with the Attention mechanism to determine the category. The experimental results show that the classification accuracy of this model reaches 92.02%, and the low accuracy on some category labels is effectively resolved. The text classification models proposed in this thesis address the inconsistent classification standards and time-consuming manual labor in hazardous chemical accident case databases, and make it easier for practitioners to learn accident prevention measures and emergency strategies, thereby helping to reduce the probability of hazardous chemical accidents. |
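
The attention step shared by both models can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not state the scoring function, so dot-product scoring against a single query vector is an assumption here, and the names `attention_pool`, `classify`, `query`, and `class_weights` are hypothetical. The per-token vectors stand in for BiLSTM or BiGRU outputs, and the five labels follow the categories listed in the abstract.

```python
import math

# The five accident categories named in the abstract.
CATEGORIES = ["production", "storage", "use", "transportation", "other"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-token vectors into one text vector.

    hidden_states: list of T vectors (stand-ins for BiLSTM/BiGRU outputs).
    query: hypothetical learned vector; dot-product scoring is an assumption.
    Returns the pooled vector and the attention weights (which sum to 1).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return pooled, weights


def classify(pooled, class_weights):
    """Linear layer plus softmax over the five categories.

    class_weights: hypothetical 5 x dim weight matrix (one row per category).
    """
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in class_weights]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CATEGORIES[best], probs
```

Tokens whose hidden state aligns with the query receive larger weights, so they dominate the pooled vector that the final classifier sees. This is the sense in which the Attention mechanism weights the contribution of different parts of the text.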