
Automated Readability Assessment of NMET Reading Passages

Posted on: 2019-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: T T Lu
Full Text: PDF
GTID: 2405330566985156
Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
Readability research began in the early years of the 20th century. At first, the readability of reading texts was measured with formulas developed by linguistic researchers, most of which combined superficial text features through simple operations. As readability research continued, these formulas drew growing criticism for their lack of linguistic depth. They were also found to be inapplicable to Chinese ESL learners. Early in this century, a few researchers began to apply machine learning to the study of readability, thanks to the rapid advancement of natural language processing over the past few decades. In search of a viable approach to assessing readability in ESL education, this study applies automated readability assessment to 530 NMET reading passages by training three classifiers (naïve Bayes, Support Vector Machine, and k-Nearest Neighbors) on 64 text features grouped into eight categories: superficial, out-of-vocabulary (OOV), lexical diversity, language model, part-of-speech (POS), syntactic, semantic, and discourse features. After several optimization methods were applied, the support vector classifier was found to perform best on the test set and was validated with an average overall accuracy of 80.94%, a precision of 81.13%, a recall of 81.13%, and an F1 score of .81. Automated readability assessment thus proves to be a viable approach to assessing readability in ESL teaching and learning. This study also investigates the significance of individual feature groups, both as perceived by high school English teachers and as measured in the automated readability assessment of NMET reading passages. Teachers thought highly of lexical, syntactic, and superficial text features but failed to comprehend discourse and language-model features properly, while POS and semantic features received only moderate attention. Among the feature groups defined in this study, syntactic and language-model features were the most important, followed by POS features, OOV features, and lexical diversity features. Superficial text features alone were not powerful predictors of readability. Semantic and discourse features are a weakness of the current model, although they have played a part in several other studies of automated readability assessment. This study also reveals shortcomings in the size of the data set and in the labeling process.
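The accuracy, precision, recall, and F1 figures reported above can be illustrated with a minimal sketch. It assumes macro averaging over readability classes, which the abstract does not state, and the function name `macro_scores` and the class labels `"A"`, `"B"`, `"C"` are hypothetical, chosen only for illustration rather than taken from the thesis.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Assumption: scores are averaged with equal weight per class
    (macro averaging); the thesis may have used another scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correctly predicted this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical gold labels and classifier predictions for six passages.
gold = ["A", "A", "B", "B", "C", "C"]
predicted = ["A", "B", "B", "B", "C", "A"]
accuracy, precision, recall, f1 = macro_scores(gold, predicted)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

F1 is the harmonic mean of precision and recall, so it stays high only when both are high, which is why it is commonly reported alongside overall accuracy for multi-class classifiers.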
Keywords/Search Tags: automated readability assessment, NMET reading comprehension, natural language processing, machine learning