| Word sense disambiguation (WSD) is the task of computationally identifying the specific meaning of an ambiguous word from its context. It has long been a central topic in natural language processing and bears directly on the performance of language processing applications such as information retrieval, machine translation, text categorization, and speech recognition. WSD research has made remarkable progress. However, with respect to word classes, it has focused mainly on nouns, verbs, and modal verbs. The WSD of English prepositions has received little attention, and the methods applied to it are relatively simple, which leaves considerable room for improvement. This paper therefore investigates the WSD of English prepositions using the theory and methods of formal concept analysis. English prepositions are function words that usually express relations between words or between a word and a sentence. They occur at high frequency and play a significant role in writing and speaking. However, most prepositions are polysemous and their senses are closely related, which easily causes ambiguity in human communication and in natural language processing. Disambiguating the meanings of prepositions is therefore necessary and is significant for research on both prepositions and WSD. This paper builds a 1.5-million-word corpus to study the WSD of the preposition over on the basis of formal concept analysis. First, a WSD model of over is constructed from the 150 samples in the training set; its accuracy reaches 93%. To mitigate data sparseness, the 150 samples in the training set and the 300 samples in the two test sets are then combined to reconstruct the WSD model of over; the accuracy of this second model is 97.55%. 
Both models exceed 90% accuracy, which indicates that formal concept analysis is effective for the disambiguation of over. Of the two, the second model is more accurate, which suggests that the more samples the data set contains, the more semantic patterns the WSD model can extract and the higher its accuracy. Second, disambiguation rules for over are extracted from each of the two WSD models. These rules capture the fundamental knowledge behind the preposition over and play a significant role in its disambiguation. When the rules are tested, the accuracies for the two models are 96.33% and 97.77% respectively, which shows that the rules can disambiguate the senses of over. With more samples, the rules become more objective and holistic. Third, given the higher accuracy of the second model, this paper further discusses the interaction between contextual features on the basis of that model. The major findings are as follows: 1) in terms of layer distribution, the features form a gradient, from semantic features, to both semantic and syntactic features, to syntactic features; 2) in terms of their influence on the senses of over, the senses are realized either by semantic features alone or by semantic and syntactic features together; 3) in terms of the extents and intents of the senses of over, semantic features are in general the extents, while syntactic and semantic features together are the intents. The construction of the WSD model of over and the analysis of the interaction between contextual features not only contribute to a deeper understanding of the knowledge behind over, but also provide theoretical and practical foundations for the WSD of other semantically complex words. |
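
For readers unfamiliar with formal concept analysis, the sketch below illustrates the standard notions of extent and intent that the abstract relies on. It is a minimal illustration only: the example phrases, the feature names, and the function names are hypothetical and are not taken from the paper's corpus or model. In standard formal concept analysis, a formal concept is a pair (extent, intent) in which the extent is the set of all objects sharing the attributes in the intent, and the intent is the set of all attributes shared by the objects in the extent.

```python
from itertools import combinations

# Toy formal context (hypothetical, not data from the paper):
# objects are usages of "over", attributes are contextual features.
CONTEXT = {
    "fly over the city": {"motion_verb", "location_noun"},
    "jump over the fence": {"motion_verb", "obstacle_noun"},
    "over ten years": {"numeral", "time_noun"},
    "over fifty people": {"numeral"},
}


def common_attributes(objects):
    """Intent operator: attributes shared by every object in the set."""
    sets = [CONTEXT[o] for o in objects]
    if not sets:
        # By convention, the empty object set shares all attributes.
        return set().union(*CONTEXT.values())
    return set.intersection(*sets)


def objects_with(attributes):
    """Extent operator: objects that carry every attribute in the set."""
    return {o for o, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every object subset.

    Brute force over subsets: fine for a toy context, exponential in general.
    """
    concepts = set()
    names = list(CONTEXT)
    for r in range(len(names) + 1):
        for subset in combinations(names, r):
            intent = common_attributes(subset)
            extent = objects_with(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
        print(sorted(extent), "<->", sorted(intent))
```

In this toy context, the two usages with a numeral form the extent of a concept whose intent is the single feature `numeral`. A disambiguation rule in this spirit would map such an intent to a sense label, although the actual rules and features used in the paper are not given in the abstract.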