| In today’s vigorous construction of socialist spiritual civilization, language civilization on campus is receiving unprecedented attention. It not only supports students’ learning and daily life but also strongly influences their psychology and values. Uncivilized speech on campus often insults students’ dignity, damages their self-confidence and self-esteem, harms their mental health, and hinders their learning and growth. The phenomenon therefore deserves close attention from both schools and parents, and how to detect such speech effectively has become an urgent problem for students’ healthy growth and for building campus language civilization. With the rapid development of deep learning in recent years and its wide application in speech recognition, detection of harmful content in audio data is currently based mainly on keyword matching. Such methods generally use acoustic models or language models to transcribe speech into text, and then apply keyword matching to determine whether a sentence contains harmful content. However, this approach requires a lexicon of sensitive words prepared in advance and a corresponding audio recognition model built on the sensitive-word retrieval system, which greatly increases the workload and reduces recognition efficiency. Moreover, samples of uncivilized speech on campus are difficult to obtain in real environments. This paper therefore proposes a method based on a generative adversarial network (GAN) to identify uncivilized speech on campus. The specific work is as follows: (1) Relevant papers and literature on detecting harmful content in audio are reviewed, and the current state of research at home and abroad is analyzed. Because uncivilized speech samples on campus are hard to collect, whereas normal samples are usually readily available in practical anomaly detection tasks, this paper adopts an anomaly detection method based on a semi-supervised generative adversarial network. (2) Teacher classroom recordings and campus uncivilized remarks are collected as experimental sample data, audio preprocessing techniques are studied, and the collected samples are preprocessed to build the training and test sets for the experiments. (3) The semi-supervised anomaly detection model proposed in this paper is trained and tested on the self-made data set. Based on an analysis of the experimental results, a self-attention module is added to the original model and an improved anomaly score function is used. The gap between detection results for normal and abnormal data increases significantly, improving the model’s ability to distinguish campus uncivilized speech samples from teacher classroom recordings. The AUC of the improved model rises from 0.908 to 0.938, which indicates that the improved algorithm can effectively identify samples of campus uncivilized speech. |
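
The abstract reports results as an AUC computed over anomaly scores. As a minimal sketch of how such a figure could be obtained, the snippet below computes the AUC as the probability that a randomly chosen abnormal sample receives a higher anomaly score than a randomly chosen normal one, with ties counted as one half. The function name `auc_from_scores` and all score values are assumptions made up for illustration; they are not the paper's anomaly score function or its data.

```python
def auc_from_scores(normal_scores, abnormal_scores):
    """Rank-based AUC: chance that an abnormal sample outscores a normal one.

    Ties contribute 0.5. Hypothetical helper, not taken from the paper.
    """
    if not normal_scores or not abnormal_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for a in abnormal_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal_scores) * len(normal_scores))


# Invented anomaly scores, for illustration only (not experimental data).
normal = [0.10, 0.20, 0.35, 0.40]     # e.g. teacher classroom recordings
abnormal = [0.30, 0.55, 0.70, 0.90]   # e.g. uncivilized speech samples

auc = auc_from_scores(normal, abnormal)
print(auc)
```

An AUC of 1.0 would mean every abnormal sample scores above every normal one, while 0.5 corresponds to chance-level separation, which is why a wider gap between normal and abnormal scores shows up as a higher AUC.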