
Feature Extraction And Selection For Word Sense Disambiguation Of Secondary English Modal Verb Might And Could

Posted on: 2023-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: R J Liu
Full Text: PDF
GTID: 2555306848463624
Subject: English Language and Literature
Abstract/Summary:
The meaning of most words in natural language is vague or uncertain and has to be determined in context. English modal verbs have particularly complicated senses. In recent years, some scholars have studied word sense disambiguation (WSD) of English modal verbs, focusing mainly on sense disambiguation and rule extraction. However, feature selection for the secondary English modal verbs has not been studied sufficiently. Based on the semantic theory of English modality, this thesis studies feature extraction and feature selection for the secondary English modal verbs might and could, using the filter method and the attribute partial order structure diagram (APOSD) method from mathematics and information science.

Using a 5-million-word multi-genre corpus of English, the senses of might are classified into three meanings (epistemic possibility, root possibility, root permission) and the senses of could into four (epistemic possibility, root possibility, root permission, root ability). Then 300 sample sentences of might and 400 sample sentences of could are selected as objects, and 50 semantic co-occurrence features of might and 51 semantic co-occurrence features of could are extracted from the context as attributes, constructing formal contexts that express the co-occurrence relationship between objects and attributes. Based on these formal contexts, rules are extracted by the APOSD method, and simple class exclusive features and composite class exclusive features are obtained by the filter method, in order to disambiguate the senses and select the optimal features of might and could.

The research found the following. First, with the Filter-APOSD method, the WSD accuracy is 99.3% for might and 98% for could. Second, 50 features of might (12 semantic, 31 syntactic, 2 pragmatic, 3 topic and 2 genre features) and 51 features of could (16 semantic, 28 syntactic, 2 pragmatic, 3 topic and 2 genre features) have been extracted. Third, optimal feature sets of might and could are obtained by the Filter-APOSD method, and the importance of these optimal features is ranked by the number of objects each one recognizes. Among the simple class exclusive features, syntactic features are the most important; among the composite class exclusive features, semantic features are the most important. Fourth, when different features are deleted from the formal contexts, the resulting variation in WSD accuracy shows that syntactic features constrain the WSD of might and could the most. Fifth, might and could share a strong classification feature: present tense. These findings provide a knowledge basis and experimental support for further studies of the secondary English modal verbs, and a reference for natural language processing (NLP) and machine translation.
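The formal context and the two kinds of class exclusive feature mentioned in the abstract can be illustrated with a small Python sketch. The abstract does not define these terms precisely, so the sketch rests on assumptions: a simple class exclusive feature is taken to be a single attribute that occurs only in sentences of one sense, and a composite class exclusive feature is taken to be a pair of attributes, neither exclusive on its own, that co-occur only in sentences of one sense. The sentences, the attribute names and the function names are all hypothetical toy values, not data or code from the thesis, and the APOSD rule extraction step is not modeled.

```python
from itertools import combinations

# Toy formal context (hypothetical, not taken from the thesis corpus):
# object id -> (sense label, set of binary co-occurrence attributes).
context = {
    "s1": ("epistemic possibility", {"perfect_aspect", "third_person_subject"}),
    "s2": ("epistemic possibility", {"stative_verb", "third_person_subject"}),
    "s3": ("root possibility", {"third_person_subject", "passive_voice"}),
    "s4": ("root permission", {"interrogative", "stative_verb"}),
    "s5": ("root possibility", {"passive_voice", "negation"}),
    "s6": ("root permission", {"interrogative", "negation"}),
}


def all_attributes(context):
    """Collect every attribute that appears in the formal context."""
    attrs = set()
    for _, feats in context.values():
        attrs |= feats
    return attrs


def senses_covering(attrs, context):
    """Senses of all objects that carry every attribute in `attrs`."""
    return {sense for sense, feats in context.values() if attrs <= feats}


def simple_exclusive(context):
    """Assumed definition: one attribute seen with exactly one sense."""
    result = {}
    for attr in sorted(all_attributes(context)):
        senses = senses_covering({attr}, context)
        if len(senses) == 1:
            result[attr] = next(iter(senses))
    return result


def composite_exclusive(context, simple):
    """Assumed definition: a pair of non-exclusive attributes that
    co-occur in at least one object and only with a single sense."""
    candidates = sorted(all_attributes(context) - set(simple))
    result = {}
    for pair in combinations(candidates, 2):
        senses = senses_covering(set(pair), context)
        if len(senses) == 1:
            result[pair] = next(iter(senses))
    return result


simple = simple_exclusive(context)
composite = composite_exclusive(context, simple)
print(simple)
print(composite)
```

In this toy context, perfect_aspect, passive_voice and interrogative each point to a single sense on their own, while stative_verb and third_person_subject are ambiguous individually and only identify a sense in combination. Ranking features by how many objects they recognize, as the abstract describes, would amount to counting the matching objects for each entry.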
Keywords/Search Tags:secondary English modal verb might and could, feature selection, filter method, attribute partial order structure diagram