
Exploration On The Syntactic Features Of English Writing Text Based On Pigai

Posted on: 2020-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Li
GTID: 2415330596993555
Subject: Foreign Language and Literature
Abstract/Summary:
This study investigates the intrinsic mechanism by which Pigai evaluates the syntactic complexity of English compositions written by Chinese college students. It first reviews previous studies of Automated Essay Scoring (AES) systems such as PEG, IEA and E-rater. Comparison with those systems reveals the weaknesses of Pigai, a locally designed AES system, and the directions in which it could be improved. A review of extant studies on Pigai then identifies a research gap: syntactic complexity and Pigai's syntactic feedback function have received little attention, although both matter for English teaching, learning and system improvement in China. The study therefore explores how Pigai evaluates syntactic complexity, in order to see which syntactic features help students score better in their writing. Two machine learning classification algorithms are used to select the significant syntactic features of the students' writing.

The study analyzes 2,300 English compositions, written mainly by second-year college students and scored by Pigai, the most popular locally designed AES system in China. It uses the Second Language Syntactic Complexity Analyzer (L2SCA) and RStudio with machine learning classification packages (Random Forest and Logistic Regression) to analyze and select the significant syntactic features of high-scored compositions, and thereby to explore Pigai's intrinsic evaluation mechanism.

Three research questions serve this purpose:
(1) Based on Pigai's scoring, how accurately do the algorithms (Random Forest, Logistic Regression) classify high-scored and low-scored texts by syntactic complexity?
(2) Once the classification models are established, how well do they fit the data?
(3) What are the significant features of high-scored texts at the syntactic level?

The major findings are as follows:
(1) Based on the 14 syntactic complexity indicators, high-scored compositions can be classified with high predictive accuracy: the accuracy rates of Random Forest and Logistic Regression are 84.9% and 93.4%, respectively.
(2) The classification models built with Random Forest and Logistic Regression both fit the data well, with AUC values of the ROC curves of 0.77 and 0.75, respectively.
(3) Combining the importance ranking from Random Forest with the significance levels from Logistic Regression, the top five significant syntactic features of high-scored compositions are MLS (Mean Length of Sentence), C/S (Clauses per Sentence), MLC (Mean Length of Clause), VP/T (Verb Phrases per T-unit) and CN/C (Complex Nominals per Clause).

These five indicators fall into three categories of syntactic feature: Unit Length (MLS, MLC), Sentence Complexity (C/S) and Specific Phrase Structure (VP/T, CN/C). That is, students who perform well under Pigai's evaluation tend to make greater use of unit length, sentence complexity and specific phrase structures in their writing. The research hopes to shed light on the teaching and learning of English writing and on the improvement of automated essay scoring systems.
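The five indicators named above are all simple ratios of unit counts. The sketch below is a minimal illustration of those ratios, not the study's pipeline: in the study the values come from L2SCA, which also extracts the counts from parsed text, whereas here the counts are supplied by hand. The names `UnitCounts`, `ratio` and `top_five_indices` are hypothetical, and the example counts are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class UnitCounts:
    """Raw counts of syntactic units for one composition (hypothetical container)."""
    words: int
    sentences: int
    clauses: int
    t_units: int
    verb_phrases: int
    complex_nominals: int


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def top_five_indices(c: UnitCounts) -> Dict[str, float]:
    """Compute the five indicators highlighted in the abstract from raw counts."""
    return {
        "MLS": ratio(c.words, c.sentences),            # mean length of sentence
        "C/S": ratio(c.clauses, c.sentences),          # clauses per sentence
        "MLC": ratio(c.words, c.clauses),              # mean length of clause
        "VP/T": ratio(c.verb_phrases, c.t_units),      # verb phrases per T-unit
        "CN/C": ratio(c.complex_nominals, c.clauses),  # complex nominals per clause
    }


# Invented counts for illustration only; not data from the study.
example = UnitCounts(words=180, sentences=10, clauses=20,
                     t_units=12, verb_phrases=30, complex_nominals=25)
indices = top_five_indices(example)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Grouping the output by category reproduces the abstract's three feature types: MLS and MLC measure unit length, C/S measures sentence complexity, and VP/T and CN/C measure specific phrase structures.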
Keywords/Search Tags: Pigai, English composition, L2 syntactic complexity, syntactic feature exploration