
Auto-Evaluation Study On The Readability Of Chinese As A Foreign Language Reading Materials

Posted on: 2019-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: W W Sun
Full Text: PDF
GTID: 2405330548466992
Subject: Education Technology
Abstract/Summary:
The rise of teaching Chinese as a foreign language and the upsurge in personalized learning have driven a sharp increase in demand for Chinese as a foreign language reading materials. Reading materials have become more plentiful, and high-quality materials with a well-graded arrangement of difficulty make it easier for readers at different levels of reading comprehension to grasp the language quickly. This thesis therefore studies the automatic assessment of the readability of reading materials for foreign learners.

Building on existing readability assessment studies, the thesis comprehensively considers the factors that influence the difficulty of reading materials from the perspective of Chinese ontology. Natural language processing and database management technologies were used to extract features from the reading materials, and the readability of the texts was evaluated with statistical machine learning methods. Computer text analysis tools were used to process six representative sets of intermediate-to-advanced textbooks and reading materials for Chinese as a foreign language, extracting the word-level, semantic, and text-level features that affect readability in order to build a readability assessment model.

The main innovations of this thesis are as follows:

(1) Considering the factors that influence the readability of textbook texts from the perspective of Chinese ontology, the thesis selects and extracts features along multiple dimensions: words, semantics, and texts. When extracting word grade features, duplicate data is processed with the "Isomorphic Multi-Level Word" factor taken into account, and word frequency is determined with reference to the "HSK Vocabulary Level Standard Outline". A total of 48 features were extracted, which captures more comprehensively the rules that experts follow when arranging teaching materials by readability. In addition, readability assessment models were built for four dimensions: words, semantics, texts, and all features combined. Each dimension was analyzed separately, so the readability of reading materials was evaluated from multiple dimensions.

(2) The SVM algorithm is used for both classification and regression modeling. In the regression model, the problem of assigning readability value labels is solved with a uniform segmentation method. Compared with expert evaluation, this costs less and also effectively prevents the model from overfitting to local characteristics of questionnaire samples. For the models built so far, experimental results on an independent test set show that the classification algorithm performs better than regression with uniform labels. However, the uniform assignment of difficulty values in the regression method describes the readability of an article in a more detailed and more precise way. As textbooks continue to be adapted and developed, it will remain a feasible method for assessing readability.
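The abstract does not spell out how the uniform segmentation method assigns regression labels, so the following is only a minimal sketch of one plausible reading. It assumes that each text carries an integer grade, that texts are listed in the order they appear in the teaching material, and that the texts of grade k are spread evenly over the interval [k, k+1). The function name `uniform_labels` and the midpoint rule are illustrative assumptions, not the thesis's actual procedure.

```python
from collections import Counter, defaultdict


def uniform_labels(levels):
    """Turn integer grades into continuous difficulty values.

    `levels` holds the integer grade of each text, in the order the texts
    appear in the teaching material (assumption: earlier texts in a grade
    are easier). The n texts of grade k receive the midpoints of n equal
    slices of [k, k+1).
    """
    totals = Counter(levels)      # number of texts in each grade
    seen = defaultdict(int)       # texts already labelled in each grade
    values = []
    for level in levels:
        j = seen[level]           # 0-based position within the grade
        values.append(level + (j + 0.5) / totals[level])
        seen[level] += 1
    return values


# Hypothetical input: four grade-1 texts followed by two grade-2 texts.
print(uniform_labels([1, 1, 1, 1, 2, 2]))
# prints [1.125, 1.375, 1.625, 1.875, 2.25, 2.75]
```

Under this reading the continuous values would serve as regression targets (for example for a support vector regressor), while the original integer grades would serve as class labels for the classification model.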
Keywords/Search Tags: Chinese as a foreign language, Machine learning, Regression model, Readability assessment