
Research On Visualization Method And Hidden Hazard Association Analysis Based On HSE Big Data Of Petroleum Enterprises

Posted on: 2020-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Wu
GTID: 2381330614464986
Subject: Safety engineering
Abstract/Summary:
The HSE event data of the petroleum industry contains a large amount of unstructured text. To uncover the potential relationships between events hidden in these text descriptions, guide enterprise safety management, and prevent safety incidents, this thesis applies natural language processing techniques such as word segmentation to the unstructured text, combines them with an association rule algorithm, constructs a mining method for safety risk text data, explores the underlying causes of safety risks, and develops text mining software for the petrochemical industry.
(1) Word segmentation is introduced using a semi-supervised method. High-frequency professional vocabulary is corrected and compiled into a user dictionary of about 5,000 words. Existing stop word lists are merged and extended with stop words specific to the petroleum industry, giving a stop word list of 1,926 words. This improves the segmentation results, and the TF-IDF algorithm is then used to extract keywords from the segmented text.
(2) Because association rule algorithms are poorly suited to the raw text data handled in this thesis, a text mining model is built that combines the Apriori algorithm with word segmentation. Apriori is chosen because it works well on Boolean data: each text record is converted into a transaction whose items are individual words, and the transactions are then mined. This yields 128 strong association rules, which are displayed as network diagrams using NetworkX together with matplotlib and other modules. Analysis of the generated rules and their visualizations reveals several problems in the enterprise's safety management, and suggestions are made accordingly.
(3) To focus on the results of interest and improve the efficiency of analysis, text mining visualization software is developed in Python with the GUI toolkit PyQt, Qt Designer, and other modules. The software integrates the text mining methods of the two preceding chapters and adds visual output. It also provides a human-computer interaction mode that lets users adjust settings in real time during text mining and see the correlations between safety risks intuitively, which makes it practical to use.
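The pipeline in part (2) of the abstract, where segmented records become word transactions that Apriori then mines for rules, can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the thesis's implementation: the records, the stop words, the thresholds, and the function names (`to_transactions`, `frequent_itemsets`, `association_rules`) are all assumptions made for the example, and the input is assumed to be already segmented.

```python
from itertools import combinations

# Hypothetical, already-segmented hazard records (illustrative only, not thesis data).
RECORDS = [
    ["valve", "leak", "corrosion", "the"],
    ["valve", "leak", "pipeline"],
    ["scaffold", "guardrail", "missing"],
    ["valve", "corrosion", "leak"],
    ["scaffold", "guardrail", "of"],
]
STOP_WORDS = {"the", "of"}  # stand-in for a real stop word list


def to_transactions(records, stop_words):
    """Turn each tokenized record into a set of distinct non-stop words."""
    return [frozenset(w for w in rec if w not in stop_words) for rec in records]


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori search; returns {itemset: support}."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    candidates = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Derive rules (antecedent, consequent, support, confidence)."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is frequent, so the lookup is safe.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules


transactions = to_transactions(RECORDS, STOP_WORDS)
frequent = frequent_itemsets(transactions, min_support=0.4)
for antecedent, consequent, support, confidence in association_rules(frequent, 0.8):
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={support:.2f} confidence={confidence:.2f}")
```

Each rule is a (antecedent, consequent, support, confidence) tuple, so it maps naturally onto a directed edge if the rules are later drawn as a network diagram. A real run on Chinese HSE text would first need a segmenter with a user dictionary, which this sketch deliberately leaves out.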
Keywords/Search Tags:text mining, word segmentation, user dictionary, association rules, visualization software