Fine-grain Word Sense Disambiguation Of English Modal Verb Might And Interaction Between Contextual Features

Posted on: 2015-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: W Shen
Full Text: PDF
GTID: 2285330452954791
Subject: English Language and Literature
Abstract/Summary:
Lexical ambiguity refers to indeterminacy in the meaning of a word. Word sense disambiguation here is the process of automatically identifying, by computer, the meaning of an ambiguous word in context. This thesis focuses on fine-grain word sense disambiguation, which identifies the meaning of an ambiguous word in context after processing complex contextual features. The contextual features in this thesis include semantic features and syntactic features. The thesis investigates the fine-grain word sense disambiguation of the English modal verb might and the interaction between its contextual features, using formal concept analysis as both theory and approach.

The sense division of might in this thesis is based on several English dictionaries and on Coates’s (1983) The Semantics of the Modal Auxiliaries. Following the fine-grain principle, the meaning of might is divided into 13 classifications in a corpus of 3,600,000 words. As context for word sense disambiguation, 56 semantic features and 11 syntactic features are extracted. First, a fine-grain word sense disambiguation model is set up from a formal context of 100 objects in the training set. The accuracy on the training set itself is 92%. The disambiguation accuracy of this model, obtained by examining it with two testing sets, is 71.5%. Second, based on a larger formal context of 300 objects from both the training set and the testing sets, a fine-grain word sense disambiguation model is reconstructed to soften the impact of data sparseness. The accuracy of this model is 76% ± 0.1472% by five-fold cross validation. The accuracies of both models are over 70%, which shows that the approach of formal concept analysis works in fine-grain word sense disambiguation. By testing all 300 samples in the formal context with rules extracted from the second model, an accuracy of 95.33% is obtained. This shows that the patterns of extracted rules are more varied under the larger formal context of 300 samples, and that these rules can disambiguate the senses of might efficiently.

The analysis of the interaction between semantic features and syntactic features is based on the second disambiguation model, since its accuracy is higher. The major findings are as follows: 1) from the perspective of the layer distribution in the attribute positive sequence diagram and the semantic classification of might, the extension of might’s sense consists of semantic features in the upper layers, while the intension is a combination of syntactic features and semantic features in the lower layers; 2) two patterns of interaction between semantic and syntactic features are proposed, separation and coexistence; analysis of the extracted rules shows that semantic and syntactic features tend to be separate, although their relation is not deterministic across the semantic classifications of might.

As an attempt, the construction of the fine-grain word sense disambiguation model of might and the analysis of interactions between semantic features and syntactic features not only enrich the correlation studies of English modal verbs, but also provide references for the study of other words whose meanings are complicated and ambiguous.
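To make the notion of a formal context more concrete, the following is a minimal Python sketch of the two standard derivation operators of formal concept analysis. It is an illustration only and not the thesis's model: the four example objects, the feature names, and the sense labels are all hypothetical, and the helper names (`extent`, `intent`, `closure`, `senses_covered`) are invented for this sketch.

```python
from typing import Dict, Set

# Hypothetical toy formal context: each object is one invented occurrence
# of "might", mapped to the contextual features it exhibits.
CONTEXT: Dict[str, Set[str]] = {
    "ex1": {"perfect_aspect", "inanimate_subject"},
    "ex2": {"perfect_aspect", "inanimate_subject", "conditional_clause"},
    "ex3": {"second_person_subject", "interrogative"},
    "ex4": {"second_person_subject", "interrogative", "conditional_clause"},
}

# Hypothetical sense label for each object (not the thesis's 13 classes).
SENSES: Dict[str, str] = {
    "ex1": "possibility",
    "ex2": "possibility",
    "ex3": "permission",
    "ex4": "permission",
}


def extent(features: Set[str]) -> Set[str]:
    """Objects that have every feature in `features`."""
    return {obj for obj, feats in CONTEXT.items() if features <= feats}


def intent(objects: Set[str]) -> Set[str]:
    """Features shared by every object in `objects`."""
    if not objects:
        # By convention, the empty object set shares all features.
        return set().union(*CONTEXT.values())
    return set.intersection(*(CONTEXT[obj] for obj in objects))


def closure(features: Set[str]) -> Set[str]:
    """Intent of the formal concept generated by `features`."""
    return intent(extent(features))


def senses_covered(features: Set[str]) -> Set[str]:
    """Sense labels of all objects matching `features`."""
    return {SENSES[obj] for obj in extent(features)}


if __name__ == "__main__":
    print(sorted(extent({"perfect_aspect"})))
    print(sorted(closure({"perfect_aspect"})))
    print(sorted(senses_covered({"interrogative"})))
    print(sorted(senses_covered({"conditional_clause"})))
```

In this toy context, a feature set whose matching objects all carry the same sense label could be read as a candidate disambiguation rule, whereas a feature shared across senses (here `conditional_clause`) cannot discriminate on its own. How the thesis actually extracts its rules is not described in the abstract.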
Keywords/Search Tags: fine-grain word sense disambiguation, formal concept analysis, English modal verb might, interaction between contextual features