Research on a Network Bad Word Discovery Model Based on the Design Ideas of AlphaGo

Posted on: 2019-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: L Nie
GTID: 2428330548466861
Subject: Software engineering
Abstract/Summary:
The rapid growth and convenience of the Internet have made network language develop and change at an unprecedented pace. In the era of information explosion, Internet text has become the main carrier of information dissemination. Network language changes daily, but it is also becoming increasingly vulgar. Because regulatory technology is still immature, supervision of network language cannot cover every case, which makes monitoring vulgar network language a major challenge. With the advent of the big data era, the ideas and techniques of artificial intelligence are maturing. One of the most prominent examples is the widely publicized match between AlphaGo and the top human Go player Lee Sedol. In essence, however, this "match of the century" is not purely a leap in technology but also a change in thinking. AlphaGo's success depends on its two "brains": the "move picker" and the "position evaluator", namely the Policy Network and the Value Network. Inspired by AlphaGo's design, this thesis transfers the ideas of the Policy Network and the Value Network to the field of public opinion monitoring: it designs a "move picker" for text classification and uses a "position evaluator" for bad word discovery. The main work of this thesis is as follows:

(1) Study AlphaGo's design ideas and related technologies, focusing on the ideas and concrete implementation of its Policy Network and Value Network, and transfer these two ideas to bad word discovery in the field of public opinion monitoring.

(2) Drawing on the Policy Network, design a "move picker" that classifies text containing vulgar network language. Compare the classification performance of a Naive Bayes text classifier and a Support Vector Machine (SVM) text classifier on short news-comment texts. Improve the SVM algorithm by using a stepped search window to select a suitable penalty factor and kernel function parameters, yielding a more efficient decision function. Combine the improved SVM with the Naive Bayes classifier to form the "policy network" model, then test the effectiveness of the improved algorithm and the classification efficiency of the combined model.

(3) Drawing on the Value Network, design a "position evaluator" that discovers bad words in vulgar network language. Compute the semantic similarity between words from their word vectors, define a word similarity threshold esim and an SO-PMI threshold esp, and compare the precision, recall, and F-measure of the experimental data at different thresholds. Use the results of a questionnaire survey to test how well the model's judgments fit natural semantic contexts, and so determine appropriate thresholds. Combine these thresholds into a "value network" for bad word discovery.
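
The two-threshold idea in point (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy word vectors, the toy corpus, the seed lists, the document-level co-occurrence counts, the 0.01 smoothing constant, the threshold values, and the rule that a candidate must pass both thresholds are all assumptions made for illustration. The names `cosine`, `pmi`, `so_pmi`, and `is_bad_candidate` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def pmi(a, b, docs, smooth=0.01):
    """Pointwise mutual information from document-level co-occurrence.
    The small smoothing constant (an assumption) avoids log(0)."""
    n = len(docs)
    df_a = sum(1 for d in docs if a in d)
    df_b = sum(1 for d in docs if b in d)
    df_ab = sum(1 for d in docs if a in d and b in d)
    if df_a == 0 or df_b == 0:
        return 0.0
    return math.log2((df_ab + smooth) * n / (df_a * df_b))

def so_pmi(word, bad_seeds, clean_seeds, docs):
    """Orientation score: association with bad seeds minus association
    with clean seeds. Positive values lean toward the bad seed set."""
    bad = sum(pmi(word, s, docs) for s in bad_seeds)
    clean = sum(pmi(word, s, docs) for s in clean_seeds)
    return bad - clean

def is_bad_candidate(word, vectors, docs, bad_seeds, clean_seeds, e_sim, e_sp):
    """Flag a word only if it passes BOTH thresholds (assumed combination rule)."""
    max_sim = max(
        (cosine(vectors[word], vectors[s]) for s in bad_seeds if s in vectors),
        default=0.0,
    )
    return max_sim >= e_sim and so_pmi(word, bad_seeds, clean_seeds, docs) >= e_sp

# Toy data, invented for this example only.
vectors = {
    "jerk": [0.9, 0.1], "idiot": [0.8, 0.2], "moron": [0.85, 0.15],
    "nice": [0.1, 0.9], "great": [0.2, 0.8], "weather": [0.15, 0.85],
}
docs = [
    {"you", "idiot", "moron"},
    {"such", "a", "moron", "jerk"},
    {"nice", "weather", "today"},
    {"great", "weather"},
    {"idiot", "jerk"},
    {"nice", "great", "day"},
]
bad_seeds = ["idiot", "jerk"]
clean_seeds = ["nice", "great"]
E_SIM, E_SP = 0.8, 0.0  # illustrative thresholds, not values from the thesis

flagged = [
    w for w in ["moron", "weather"]
    if is_bad_candidate(w, vectors, docs, bad_seeds, clean_seeds, E_SIM, E_SP)
]
print(flagged)  # prints ['moron']
```

In practice the thresholds would be tuned by comparing precision, recall, and F-measure across candidate values, as the abstract describes, rather than fixed by hand as they are here.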
Keywords/Search Tags:Policy Network, Value Network, Text Classification, Semantic Similarity, Bad Word Discovery