
Automated Assessment On The Relativity Of Implicit Discourse Relations Of Writings Of College English Test Band-4

Posted on: 2018-03-02  Degree: Master  Type: Thesis
Country: China  Candidate: B Xiang  Full Text: PDF
GTID: 2335330536957577  Subject: Foreign Linguistics and Applied Linguistics
Abstract/Summary:
A sound automated essay scoring system should cover two aspects: language quality assessment and content quality assessment. Content quality assessment is the more complicated of the two because it has to dissect the organic connections among language chunks (words, phrases, clauses). The scoring rubric for College English Test Band-4 (CET-4) writing focuses on content quality and supplements it with language quality. In other words, content quality, namely implicit discourse relations, is the key benchmark for judging whether a piece of writing is good or bad. This is one reason the research topic was chosen. The automated essay scoring system is conceived as follows: calculate the relativity (degree of relatedness) between the to-be-assessed writing and already rated writings, then score the to-be-assessed writing by reference to the rated data. This design shows that judging the relativity of implicit discourse relations, the central argument of this research, plays a core role in constructing an automated essay scoring system. Two major models are available for analyzing implicit discourse relations: the vector space model (VSM) and latent semantic analysis (LSA). The former treats all words except stop words as features and represents the text with the resulting feature vectors, but this method cannot handle polysemy and synonymy. LSA begins with words, the minimal constituents of discourse, and is supported philosophically by the issues of similarity and generalization in language acquisition and knowledge acquisition. These issues are also known as Plato's problem: how can we learn so much knowledge from such limited information?
The theoretical foundation of this research is LSA. LSA holds that words do not exist in isolation; instead, they are closely correlated with one another through a latent semantic space. However, no word is linked with the latent semantic space directly, so the feature items directly relevant to that space must be extracted. Feature item extraction falls into two steps: 1. Extracting the feature items roughly, that is, preprocessing the text, which includes normalizing upper-case and lower-case letters, removing stop words, and stemming; 2. Calling the svd function in the mathematical software MATLAB to extract the feature items precisely, through the following moves: first, building a matrix from the roughly extracted feature items of each text; second, decomposing the original matrix into three smaller matrices with the svd function; third, keeping the top k columns of each smaller matrix according to specific demands; finally, reconstructing a new matrix from the three truncated matrices by reversing the decomposition (multiplying them back together). The reconstructed matrix suppresses noisy information and preserves the essential information of the raw data, thereby accomplishing the feature item extraction. This is how computers simulate the way human beings identify similarity and generalize knowledge. It is also the core theoretical foundation of the thesis. In this thesis, a classical, concise case is first employed to demonstrate the significance of latent semantic analysis in assessing the relativity of implicit discourse relations. Then CET-4 writings by non-English-major graduates of Hubei University of Technology are used as empirical data for an in-depth analysis. The conclusion is that there is a connection between the correlation coefficient of implicit discourse relations and the human-rated results.
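
The two extraction steps described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes Python with NumPy's `numpy.linalg.svd` in place of the MATLAB svd function, a tiny hypothetical stop-word list, and three invented sentences standing in for essays. Stemming is left out for brevity, and all function names are illustrative.

```python
import re
import numpy as np

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in", "on"}


def preprocess(text):
    """Step 1: normalize case, tokenize, drop stop words (stemming omitted)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def term_document_matrix(docs):
    """Build a terms-by-texts count matrix from the roughly extracted items."""
    tokenized = [preprocess(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    row = {t: i for i, t in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for t in toks:
            matrix[row[t], j] += 1.0
    return matrix, vocab


def rank_k_approximation(matrix, k):
    """Step 2: decompose with SVD, keep the top k components, multiply back."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] @ np.diag(s[:k]) @ vt[:k, :]


def cosine(u, v):
    """Cosine similarity between two text columns; 0.0 if either is all zeros."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


# Invented example texts: the last one plays the role of the essay to be scored.
docs = [
    "The doctor treats patients in the hospital",
    "A physician treats patients",
    "Students write essays in the exam",
]
A, vocab = term_document_matrix(docs)
A_k = rank_k_approximation(A, k=2)

# Relatedness of the to-be-assessed text to each rated text in the reduced space.
for j in range(len(docs) - 1):
    print(j, round(cosine(A_k[:, j], A_k[:, -1]), 3))
```

The choice of k is the "specific demands" knob mentioned in the abstract: a small k discards more of the raw counts, while k equal to the full rank simply reproduces the original matrix. In a scoring setting, the similarities to the rated texts could then be compared against their human-assigned scores.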
Keywords/Search Tags: writings of College English Test band-4, implicit discourse relations, relativity, latent semantic analysis, singular value decomposition