Research On Data Mining Exploration In The Field Of Power Text

Posted on: 2021-04-23 | Degree: Master | Type: Thesis
Country: China | Candidate: Y N Xu
GTID: 2392330614463759 | Subject: Control engineering
Abstract/Summary:
In recent years, the rapid development of information technology, represented by intelligent tools, has accelerated the integration of industrialization and informatization and driven the growth of the national economy. It is also profoundly changing how people live and work. Because specialized fields differ from one another, the text produced in each field has distinct domain characteristics. This makes it difficult to describe the relevant information, to identify the field of a text with query tools, and to express its semantic content accurately.

With the further development of smart grid control, power companies have accumulated a large amount of text data, and a growing number of dissertations and reports on the power field are available on the Internet. Most existing text mining research, however, focuses on sentiment-related classification, and there are few reports on text mining in the industrial and power fields. How to use this text data effectively has become a research hot spot. Mining such data has long been difficult for the information industry: researchers need not only a solid foundation in Internet technology but also a thorough understanding of the relevant domain knowledge. This requirement also raises the difficulty of text mining and processing tasks in domains such as power and industry.

To address these difficulties with power text data, this dissertation studies three aspects: keyword extraction from power text, classification of power grid complaint text, and construction of a corpus and a dictionary for power text.

1. This dissertation introduces the existing categories of text in the power field in detail. For keyword extraction, it analyzes power-related data sets together with data about the power industry crawled from the Internet. New word discovery and keyword extraction yielded a considerable number of characteristic words related to the power field. These words were then used as a dictionary for word segmentation of power text. Segmentation experiments show that, compared with a traditional general-purpose Chinese dictionary, the dictionary built in this dissertation significantly improves the segmentation of power text.

2. A classification experiment was carried out on a set of complaint texts from power grid companies. A naive Bayes classifier, an SVM (Support Vector Machine) classifier, a logistic regression classifier, and other classifiers were used to classify the data. The dissertation explores how these machine learning algorithms perform on power-domain text and compares their results on the power complaint texts.

3. In response to the lack of a publicly available power corpus and power-domain dictionary, this dissertation uses power-related texts and data sets crawled from the Internet to design a corpus for the power field. The corpus is divided into an electric network text corpus and electric power professional terminology. The dissertation also compiles an electric power domain dictionary containing tens of thousands of words and describes the method used to construct it.
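The segmentation comparison in item 1 can be illustrated with a minimal sketch. The abstract does not say which segmentation algorithm the dissertation uses, so forward maximum matching is assumed here purely for illustration, and both dictionaries and the example sentence are hypothetical toy data rather than entries from the dissertation's dictionary.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Segment text by greedily taking the longest dictionary word at each position.

    Assumption: forward maximum matching stands in for whatever segmenter
    the dissertation actually used; the abstract does not name one.
    """
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit;
        # a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy dictionaries: a general one, and the same one extended
# with a few power-domain terms.
general_dict = {"变压器", "发生", "故障", "电压"}
power_dict = general_dict | {"配电变压器", "跳闸", "低电压"}

sentence = "配电变压器发生跳闸故障"  # "The distribution transformer tripped and failed"
general_tokens = forward_max_match(sentence, general_dict)
power_tokens = forward_max_match(sentence, power_dict)
print(general_tokens)
print(power_tokens)
```

With the general dictionary alone, domain terms such as 配电变压器 (distribution transformer) and 跳闸 (trip) are split into fragments; with the domain entries added, they survive as single tokens. This is the kind of effect the segmentation experiments are described as measuring.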
Keywords/Search Tags: Electric power dictionary, Word segmentation, Power text classification, Machine learning, Keyword extraction, Power corpus, Data mining