Analysis Of Readability Of Chinese Newspaper And Periodical Textbook Based On SVM Algorithm

Posted on: 2023-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Jia
GTID: 2555307163492504
Subject: Chinese international education
Abstract/Summary:
Readability refers to the degree to which a text can be understood, that is, the difficulty of the text. Readability research originated in the United States, where it was used mainly to evaluate the comprehensibility of English texts by means of readability formulas. Readability research in China began in English teaching and drew on American readability theories. Since the turn of the century, the teaching of Chinese as a foreign language has developed vigorously, and research on the readability of Chinese texts has entered a new stage. Whether it relies on formulas or on models combined with machine learning, most readability research in China takes its language materials from comprehensive course textbooks or test questions for Chinese as a foreign language. Few studies address the readability of newspaper and periodical materials. Compared with comprehensive course textbooks, however, newspaper texts differ in vocabulary and sentence patterns and are highly time-sensitive. Automatic readability grading of such texts can help learners supplement their reading materials promptly and improve their learning efficiency and confidence. It is therefore necessary to study and analyze the key factors that affect the difficulty of newspaper and periodical texts.
This thesis takes all the reading texts from the four difficulty levels of "Newly Editing Newspapers, Learning Chinese - Chinese Newspaper Reading" as its language materials. Working at three levels (Chinese characters, vocabulary, and sentences), it summarizes the specific features of each reading text at each level. The chi-square test was used to select 8 influencing factors with high correlation, and the support vector machine (SVM) algorithm was used to build a model that measures the readability of newspaper texts. The main conclusions of this thesis are as follows:
First, eight factors have a considerable impact on readability: B-level characters, C-level characters, average word frequency, average number of clauses, type/token ratio, new words, idioms, and average sentence length. The lexical level has a greater impact on the readability of newspaper texts than the sentence level. Second, text difficulty evaluation models were built with the support vector machine (SVM) and with algorithms such as decision tree and random forest. The SVM achieved the highest accuracy, 74.08%, with an adjacent accuracy (±Acc) of 92.9%. Third, the model built on newspaper texts reaches a prediction accuracy of 76.9% on the comprehensive textbook "Boya Chinese", indicating that it is also applicable to other texts for Chinese as a foreign language. In future research, machine learning algorithms such as SVM can be applied to Chinese text readability assessment.
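The two evaluation figures reported for the SVM, exact accuracy and adjacent accuracy (±Acc), can be illustrated with a minimal Python sketch. It assumes that adjacent accuracy counts a prediction as correct when it falls within one difficulty level of the labelled level, which is the usual reading of the term; the abstract does not spell out the definition. The function names, the `tolerance` parameter, and the ten sample labels are hypothetical and are not taken from the thesis.

```python
from typing import Sequence


def exact_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of texts whose predicted level equals the labelled level."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def adjacent_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], tolerance: int = 1
) -> float:
    """Share of texts predicted within `tolerance` levels of the label.

    Assumption: "adjacent accuracy" means an error of at most one level.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if abs(t - p) <= tolerance)
    return hits / len(y_true)


# Hypothetical labels for ten texts on a four-level scale (1 = easiest).
labels = [1, 1, 2, 2, 2, 3, 3, 4, 4, 4]
predictions = [1, 2, 2, 2, 4, 3, 2, 4, 3, 1]

print(exact_accuracy(labels, predictions))     # 0.5
print(adjacent_accuracy(labels, predictions))  # 0.8
```

In this toy data, five of ten predictions match exactly and eight fall within one level, so adjacent accuracy is always at least as high as exact accuracy. The same relationship holds between the reported 74.08% and 92.9%.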
Keywords/Search Tags:Text Readability, Newspapers and periodicals, Teaching Chinese as a second language, Machine learning