A Study On Computer Processing Of Errors In Chinese Interlanguage Corpus

Posted on: 2009-01-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 1115360302973188
Subject: Linguistics and Applied Linguistics
Abstract/Summary:
The purpose of this dissertation is to find methods by which computers can assist humans in processing errors in Chinese interlanguage. There are four kinds of assistance: automatic error detection and prompting, interactive error detection and prompting, automatic error tagging, and automatic management and retrieval of error tags. The most intelligent of these, automatic error detection and prompting, is the focus of our study. The first step is to detect errors automatically. The computer then gives hints, which may include direct revisions, revision suggestions, reasons for the errors, or questions. To date, the only relatively mature technology of this kind is automatic English spell checking; Chinese text proofreading systems have only just reached a practical level, so much of the field remains to be explored. As far as the author has observed, no study on computer processing of errors in a Chinese interlanguage corpus has yet appeared in the published literature.

This dissertation is divided into three parts:

First, computer-oriented error analysis. Combining the error category system of the HSK Essay Corpus with that of Analysis of Errors of Foreign Students in Learning Chinese Grammar, we analyzed the feasibility of automatic error detection and prompting in light of the computer's capabilities and the knowledge required to process natural language by computer.

Second, experiments on four special sentence types. We selected "Ba" sentences, "Bi" sentences, "You" sentences and "Bei" sentences as entry points, using criteria such as a high error ratio and a high degree of formalization. We used a rule-based method to carry out experiments on automatic error detection and prompting. Comparing the results with those of manual error tagging, we conclude that the computer can play a useful role in helping to detect errors in these special sentence types.

Third, a proposal to improve the interlanguage corpus tagging method.
Using the well-known edit distance algorithm and taking the Chinese "word" as the unit, the computer can follow the edit path to work out the fundamental operations required to correct the original sentence. In addition, on the basis of the results of this automatic comparison, we let the computer take part in the classification of errors. This proposal can improve the interlanguage corpus tagging method and shows good cooperation between human and computer.

Compared with earlier related research, our main innovations are:

(1) Emphasis

In error analysis, earlier research was human-oriented, aiming to work out teaching tactics that help students reduce or avoid errors in their learning. Our research is computer-oriented, aiming to let the computer do the detection and prompting automatically.

In Chinese interlanguage corpus building, linguists are used to tagging errors manually. We discuss how to tag errors with the help of the computer.

In Chinese text proofreading, earlier research focused on the occasional errors in compositions by native speakers of Chinese, whereas we concentrate on the regular errors in compositions by non-native speakers.

In CCAI (Chinese Computer Aided Instruction), previous research focused mainly on how to impart knowledge to students via computer, while the main concern of this research is how the computer can give students feedback on their input.

(2) Guiding principle

We analyzed the computer's NLP capability and error-processing capability and proposed the guiding principle that should be followed.

(3) Technical methods

We carried out experiments on automatic error detection and prompting for four special sentence types, and the results supported the feasibility of the rule-based method.

We proposed an automatic tagging method based on the edit distance algorithm.
It can improve the quality and speed of tagging.

The significance of this dissertation lies in the following two aspects:

Theoretically, it argues that CALL (Computer Assisted Language Learning) methods can go beyond simple multimedia teaching, but that it seems impossible to process all kinds of errors by relying completely on computer-based processing. We realistically analyzed the computer's ability in this domain. In addition, computer-oriented error analysis offers a new perspective for research on Chinese grammar and on teaching Chinese as a foreign language.

Practically, the methods proposed in this dissertation could help to reduce the teaching burden; improve the speed and quality of Chinese interlanguage corpus tagging; assist in Chinese learning; and raise the degree of automation in composition scoring systems.
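
The word-level edit distance comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `edit_ops`, the operation labels, and the example sentences are assumptions made for illustration, and the sentences are assumed to be already segmented into words (segmentation itself is out of scope here). The sketch computes the standard Levenshtein table over words and then walks the edit path backwards to list the operations that turn a learner's sentence into its corrected version.

```python
from typing import List, Tuple

# (operation, word in the original sentence, word in the corrected sentence)
Op = Tuple[str, str, str]


def edit_ops(original: List[str], corrected: List[str]) -> List[Op]:
    """Return the word-level edit operations turning `original` into `corrected`.

    Hypothetical helper for illustration; inputs are pre-segmented word lists.
    """
    n, m = len(original), len(corrected)
    # dist[i][j] = minimum number of edits to turn original[:i] into corrected[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if original[i - 1] == corrected[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete a word
                dist[i][j - 1] + 1,         # insert a word
                dist[i - 1][j - 1] + cost,  # keep or substitute a word
            )

    # Follow the edit path backwards from the bottom-right corner.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and original[i - 1] == corrected[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            ops.append(("keep", original[i - 1], corrected[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("substitute", original[i - 1], corrected[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", original[i - 1], ""))
            i -= 1
        else:
            ops.append(("insert", "", corrected[j - 1]))
            j -= 1
    ops.reverse()
    return ops


# Illustrative (invented) learner sentence with a redundant word in a "Bi" sentence.
learner = ["他", "比", "我", "很", "高"]
revised = ["他", "比", "我", "高"]
changes = [op for op in edit_ops(learner, revised) if op[0] != "keep"]
print(changes)  # [('delete', '很', '')]
```

The non-"keep" operations are the raw material a tagger could then map onto error categories (for example, a deletion corresponding to a redundant word); that mapping is a design choice not specified in the abstract.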
Keywords/Search Tags:interlanguage, error, error analysis, error automatic detection and prompt, edit distance