| With the continuous and rapid development of the national economy, the domestic industrial structure has changed and economic development has become unbalanced, resulting in large differences in the types and frequency of production safety accidents across industries and regions. In addition, since the promulgation of the Regulations on Reporting and Investigation of Production Safety Accidents, safety supervision departments and large and medium-sized enterprises have accumulated a large volume of raw accident data. Although safety management personnel use various production safety information systems for accident statistics and classification, they do not fully exploit the value of production safety accident cases. At present, the main problem facing safety management departments and researchers is no longer acquiring accident data, but analyzing the internal relationships in massive accident data, extracting useful information from it, and applying the results to practical work so as to truly serve safety management and accident prevention. Moreover, the analysis of accident data must also integrate information such as economic development, geographic information, and the level of industrial technology. The scale of the data therefore grows sharply, and advanced computing technology is needed for analysis and processing. With the increasing maturity of machine learning and natural language processing, large-scale text data analysis has attracted growing attention. Based on this understanding, this thesis took massive production safety accident case data as its research object and drew on computer science, management science, statistics, economics, social science, and other related disciplines. Using natural language processing and word frequency statistics, it realized the automatic classification of accident data, mined in depth the historical accident data and
related information on the economy, policy, laws, and regulations, identified the latent characteristics and patterns behind the data and their internal relationships, and provided a reference for enriching and improving production safety theory. The main research contents and results were as follows: (1) Automatic classification of accident data by industry using natural language processing. Based on Chinese word segmentation and word frequency statistics, keywords for production safety accidents in typical industries were extracted from nearly 20 years of production safety accident case data published by the former State Administration of Work Safety. The reliability of keyword-based automatic classification was evaluated quantitatively by recall and precision. The keywords for coal mine accidents were “coal mine”, “outburst”, “level”, “coal mining”, “working face”, “tunneling”, and “permeable”; recall reached 95.65% and precision was 96.2%. For non-coal mines, records that contained “quarries” or “limestone” or “copper ore” or “iron ore” or “gold ore” or “middle” or “mining” or “roadway” or “mine” or “mine” but did not contain “coal mine”, “outburst”, “level”, or “coal mining” were classified as non-coal mine accidents, which raised the precision and recall of keyword-based classification to more than 85%. For traffic accidents, 46 keywords were selected, including “collision”, “truck”, “road”, “driving”, “traffic accident”, “KM”, “bus”, “heavy”, “car”, “collision”, “section”, “motorcycle”, “car”, “kilometer”, “travel to”, and “rear-end”; recall and precision were 94.2% and 94.9%, respectively. These keyword sets met the needs of automatic accident text classification well. (2) Design and development of an accident statistics query system based on automatic accident classification. Automatic classification of
accident text was realized according to keywords, and spatial clustering was used to study the accident statistics. The k-means clustering algorithm was found to capture the spatial distribution characteristics of production safety accidents comparatively accurately, and the clustering results for coal mine accidents were closely related to the geological structure of China's coalfields. The spatial clustering results for traffic accidents and construction accidents were closely related to regional levels of economic development. On the basis of the automatic classification and the spatial distribution characteristics of accidents, a production safety accident query and statistics system was designed and developed. It supports multidimensional queries by accident grade, accident type, and different temporal and spatial scales, and relies on up-to-date visualization tools to present query and statistical results in various ways. (3) Investigation of the spatiotemporal characteristics of production safety accidents based on data mining. Using the established query system for massive production safety accident data, the temporal and spatial distribution characteristics of accidents in each typical industry were studied. A summary of production safety accidents in China showed that the number of accidents and the number of deaths had been declining since 2003, indicating that the production safety situation across industries in China was steadily improving. A detailed analysis of data on four typical industries (coal mining, transportation, construction, and chemicals) led to the following conclusions. Coal mine accidents and traffic accidents followed similar patterns over the years: both rose first, then declined rapidly, and finally declined slowly. July to October was the annual peak period for coal mine accidents, while the peak period of traffic accidents had a long time
span, including January, February, May, August, and October. The occurrence of coal mine accidents and traffic accidents had obvious regional characteristics: coal mine accidents were most frequent in Southwest China, and traffic accidents were most frequent in Southwest and South China. For construction accidents and chemical accidents, the trend over the years was not particularly pronounced, and the curves fluctuated. August was the annual peak for construction accidents, while the peaks for chemical accidents included not only August but also January, March, and April, a relatively wide time span. In terms of spatial distribution, construction and chemical accidents were more numerous in East China. (4) Exploration of the correlation between the characteristics of massive production safety accident data and economic development. Based on the massive production safety accident data published by the former State Administration of Work Safety, this thesis summarized the overall national trend in production safety accidents and the basic situation of economic development, and found that economic development and production safety accidents constrain and influence each other. The influence of science and technology, human resources, and economic policy on production safety was analyzed. There was an obvious correlation between economic development factors and production safety, but different economic factors had different influences on production safety in different periods and stages. The relationship between regional economic development and regional production safety accidents was also discussed: the types of production safety accidents differed across economic regions. Industrial structure was another important factor affecting production safety, since different industrial structures carry different safety risks, resulting in different types of production
safety accidents and degrees of risk. Based on the correlation and coupling between the safety system and the economic system, a correlation characteristic system was established between accident indexes (number of accident deaths, death rate per 100 million yuan of GDP, and death rate per 100,000 people) and economic indexes (resident consumption, energy consumption, education expenditure, average salary of employed personnel, and scientific research expenditure). Correlation analysis showed that the comprehensive correlation coefficients of scientific research investment and education expenditure with the accident indexes were clearly higher than those of the other economic indexes. In addition, the grey prediction model was improved and a Gaussian-function accident prediction model was established. The overall error of the model was small, it maintained good predictive performance even late in the prediction period, and its parameters were simple to solve. (5) Analysis of the influence of safety policy on the characteristics of massive production safety accident data. According to the data and statistics, the evolution of safety policies issued since the founding of the People’s Republic of China can be divided into four stages: an initial establishment and exploration stage (1949-1977), a restoration, consolidation, and improvement stage (1978-1991), a system and mechanism reform stage (1992-2002), and an innovative development stage (2003-present). Safety policies were sorted and classified along four dimensions: mandatory strength, industry weight, technical degree, and breadth. On this basis, China’s safety policy system was constructed. An OLS regression model was established to describe quantitatively the relationship between the number of national safety policies and the number of deaths in production safety accidents from 2000 to 2016. The model passed significance, heteroscedasticity, autocorrelation, and residual tests, which supports its soundness. The relative error
between the fitted and actual values was within 5%, indicating that the model can feasibly describe the relationship between the number of safety policies and the number of accident deaths. The model showed that the influence coefficient of the number of safety policies on the number of accident deaths was 91.540 from 2000 to 2002, -63.362 from 2003 to 2005, and 138.75 from 2006 to 2008. From 2009 to 2016 it remained constant, but the number of accident deaths decreased by 27,368, indicating a marked improvement in the safety situation. Four policy index variables were extracted from the safety policies: mandatory strength, breadth, technical degree, and industry weight. The relationship between safety policy and work safety was explored using the death rate per 100 million yuan of GDP, and unit root, cointegration, and Granger causality tests were conducted. A VAR model was established in EViews 10 to analyze the impulse response of the production accident death rate to each policy indicator. The negative correlations of mandatory strength, breadth, industry weight, and technical degree weakened in that order, and showed a weak-strong-weak pattern over time. This thesis established a production safety accident query and statistics system through data mining. The system supports a relatively complete analysis of the spatiotemporal distribution characteristics of production safety accidents in typical industries. The temporal and spatial distribution of accidents can help regions formulate refined production safety policies. In addition, this thesis explored the correlations between massive production accident data and the economy and policy. Impulse response functions between each production safety policy index and the production accident death rate were established. These functions can be used to judge the impact of policies on accidents effectively, so as to provide a
theoretical basis for improving production safety policies. |
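
The keyword-based classification and its evaluation by recall and precision, described under (1), can be illustrated with a minimal Python sketch. It is not the thesis implementation. The names `KeywordRule`, `precision_recall`, and `evaluate`, the shortened English keyword lists, and the five toy records are all assumptions made for illustration; the thesis itself works on segmented Chinese text and on far larger keyword sets and data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeywordRule:
    """Hypothetical rule: match if any include keyword and no exclude keyword occurs."""
    label: str
    include: tuple[str, ...]
    exclude: tuple[str, ...] = ()

    def matches(self, text: str) -> bool:
        lowered = text.lower()
        has_include = any(k in lowered for k in self.include)
        has_exclude = any(k in lowered for k in self.exclude)
        return has_include and not has_exclude


def precision_recall(predicted, actual):
    """Return (precision, recall) for two parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def evaluate(rule, records):
    """Apply one rule to (text, true_label) records and score it."""
    predicted = [rule.matches(text) for text, _ in records]
    actual = [label == rule.label for _, label in records]
    return precision_recall(predicted, actual)


# Illustrative English stand-ins for a few of the keywords listed in the abstract.
COAL_RULE = KeywordRule(
    "coal mine",
    include=("coal mine", "outburst", "working face", "tunneling"),
)
NON_COAL_RULE = KeywordRule(
    "non-coal mine",
    include=("quarry", "limestone", "iron ore", "roadway", "mine"),
    exclude=("coal mine", "outburst", "coal mining"),
)

# Hypothetical toy records; not data from the thesis.
RECORDS = [
    ("Gas outburst at the working face of a coal mine", "coal mine"),
    ("Roof collapse in a limestone quarry", "non-coal mine"),
    ("Truck collision on a road section", "traffic"),
    ("Flooding in an iron ore mine roadway", "non-coal mine"),
    ("Tunneling crew injured during subway construction", "construction"),
]

if __name__ == "__main__":
    for rule in (COAL_RULE, NON_COAL_RULE):
        precision, recall = evaluate(rule, RECORDS)
        print(f"{rule.label}: precision={precision:.2f}, recall={recall:.2f}")
```

On the toy records, “tunneling” also fires on a construction accident, which lowers the precision of the coal mine rule. This is the kind of overlap the exclusion list in the non-coal mine rule is meant to reduce.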