Research And Application Of Classification Algorithm For Education Policy Texts

Posted on: 2020-12-16 | Degree: Master | Type: Thesis
Country: China | Candidate: T Wang
GTID: 2417330575465414 | Subject: Engineering
Abstract/Summary:
With the country's vigorous promotion of education and the rapid development of information technology, education policy data continues to grow, and a large volume of it is now available online. Managing this mass of data efficiently enough to extract valuable information has become very difficult. Collecting, classifying, and managing the various types of education policy data by traditional manual methods involves a very large workload and a complicated task that is hard to complete. Text classification technology from natural language processing can classify text data automatically and more efficiently while saving labor costs. This thesis therefore applies text classification technology to the automatic classification of education policy texts, so that education policy data can be located quickly and found accurately, and then realizes information management and visual analysis of that data. Taking education policy texts as its research object, the thesis studies the collection of education policy data, text classification, and data visualization, and finally designs and implements an education policy text classification and visualization system. The main work of this thesis is as follows:

1. To obtain a large amount of education policy data comprehensively, this thesis analyzes the webpage structure and characteristics of the pkulaw database website and then designs and implements an education policy data collection module based on web crawler technology. The module handles identity authentication by simulating a login, follows the idea of the breadth-first search algorithm, and combines Beautiful Soup, regular expressions, and a database to collect the data. In short, the collection module avoids overly frequent requests to the pkulaw website and incomplete education policy data, achieving comprehensive and efficient collection.

2. To classify education policies accurately, this thesis proposes a text classification algorithm that combines a title and body attention mechanism. Because an education policy document consists of a title and a content body, the algorithm models the document as these two parts. For the representation of feature words, it uses a recurrent structure to extract the contextual semantic information of each word, which may help disambiguate word meaning. For the representation of the title text and the content text, it uses max-pooling to preserve the important latent semantic information in each. For the representation of the whole document, it uses an attention mechanism to assign attention weights to the title and the content, so that the document representation takes full advantage of the title information. In comparison experiments, the proposed algorithm outperforms the existing classification algorithms it was compared against on education policy texts.

3. To strengthen information management in the field of education policy, this thesis designs and implements a classification and visualization system for education policy texts. On the one hand, applying the title and content attention classification algorithm in the system automates the classification of education policies, reduces the pressure on education policy management personnel, and improves the efficiency of education policy management. On the other hand, by analyzing and mining the education policy data, the system displays its geographical distribution and the number of policies in each category, so that the overall picture of education policy data is presented visually to assist education policy authorities in making decisions.

In summary, this thesis first designs the education policy data collection module to collect education policy data efficiently and comprehensively. Second, it proposes an education policy classification algorithm based on the title and body attention mechanism, which makes full use of the semantic information of feature words and assigns weights to the title and the content according to their importance to the classification result. Comparison with other algorithms on the education policy dataset shows that the proposed algorithm performs better than the comparison algorithms. Finally, the education policy text classification and visualization system not only improves the performance of education policy classification but also improves the efficiency of education policy management and facilitates the information management of education policy data in China.
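To make the title and body attention idea concrete, the following is a minimal sketch of the pooling and weighting steps only, not the thesis's actual model. It assumes the per-word vectors have already been produced by some recurrent layer, and it scores each part with a dot product against a hypothetical learned vector named `query`; the function names, the scoring function, and the toy numbers are all illustrative assumptions.

```python
import math
from typing import List, Tuple

Vector = List[float]


def max_pool(token_vectors: List[Vector]) -> Vector:
    """Element-wise max over word vectors: keeps the strongest value per dimension."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    return [max(dim_values) for dim_values in zip(*token_vectors)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax; the outputs are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(title_vec: Vector, body_vec: Vector,
                   query: Vector) -> Tuple[Vector, List[float]]:
    """Weight the title and body vectors by attention and sum them.

    `query` stands in for a learned parameter (an assumption of this sketch).
    """
    weights = softmax([dot(title_vec, query), dot(body_vec, query)])
    document = [weights[0] * t + weights[1] * b
                for t, b in zip(title_vec, body_vec)]
    return document, weights


# Toy contextual word vectors (hypothetical values, 2 dimensions).
title_tokens = [[0.2, 0.9], [0.7, 0.1]]
body_tokens = [[0.1, 0.3], [0.4, 0.2], [0.0, 0.5]]
query = [1.0, 1.0]

title_vec = max_pool(title_tokens)
body_vec = max_pool(body_tokens)
doc_vec, attn = attention_fuse(title_vec, body_vec, query)
print("attention weights (title, body):", attn)
print("document vector:", doc_vec)
```

In a full model the resulting document vector would be passed to a classifier layer, and `query` would be trained together with the recurrent layer rather than fixed by hand.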
Keywords/Search Tags: Education Policy, Text Classification, Attention Mechanism, Data Visualization