The Construction Of Chinese Morpheme Words Knowledge Base And Its Application In Understanding Unregistered Words

Posted on:2018-07-20

Degree:Master

Type:Thesis

Country:China

Candidate:J J Qu

Full Text:PDF

GTID:2355330518491081

Subject:Linguistics and Applied Linguistics

Abstract/Summary:

The Chinese vocabulary system is constantly developing and changing, so the number of unknown words is unbounded. Morphemes, however, as the basic components of word formation, are relatively limited in number and stable in semantic function. In natural language processing, morphemes can therefore serve as a basic resource for acquiring word formation knowledge, which can in turn be used to recognize and understand unknown words. However, most word formation knowledge bases are used only for compiling statistics on word formation rules, which leaves a gap between word formation knowledge bases and applied research on unknown words. To obtain word formation knowledge that is more conducive to machine computation, we take the two-character words common to the Modern Chinese Dictionary (5th edition), HowNet (2009 edition) and Cilin (extended edition) and construct the Chinese Morpheme-word Knowledge Base. Each word sense forms one record, for a total of 39,102 records, and each record has 19 properties, the finest-grained of which are two kinds of morpheme meaning, one based on HowNet and one based on the Modern Chinese Dictionary. Using the Chinese Morpheme-word Knowledge Base, this paper carries out three main lines of research.

First, we statistically analyze the word formation knowledge of nouns, verbs and adjectives in the Chinese Morpheme-word Knowledge Base from five aspects: semantic category, the combination of the morphemes' parts of speech, the combination of the morphemes' semantic categories, grammatical structure type, and the type of relationship between word meaning and morpheme meaning. We then analyze, within each part of speech, how the combination of the morphemes' parts of speech, the combination of the morphemes' semantic categories, and the type of relationship between word meaning and morpheme meaning are distributed across grammatical structure types.

Second, based on the word formation knowledge in the Chinese Morpheme-word Knowledge Base, we use a phased algorithm to automatically predict the word formation knowledge of unknown words. Using the combination of morpheme meanings or the combination of the morphemes' semantic categories, we first predict the semantic-level knowledge, then determine the corresponding morphemes, and finally obtain the word formation knowledge of the unknown word. The algorithm is simple, intuitive and reasonable. A prediction counts as correct only when all seven predicted items are correct: the first morpheme's part of speech, the first morpheme's semantic category, the first morpheme's meaning, the last morpheme's part of speech, the last morpheme's semantic category, the last morpheme's meaning, and the grammatical structure type. The experimental results show a prediction accuracy of 62.32% and a recall of 61.71%.

Third, based on the predicted word formation of unknown words, we use a word similarity measure based on word formation to find the word in the Chinese Morpheme-word Knowledge Base that is most similar to the unknown word, in order to achieve semantic understanding of unknown words. Under the experimental evaluation criteria, with a threshold of 0.8, the efficiency of semantic comprehension of unknown words is 51.69%.

To sum up, the Chinese Morpheme-word Knowledge Base we constructed achieves good results in applied research on unknown words and has important value for natural language processing.
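The strict evaluation criterion described in the abstract (a prediction is correct only if all seven items match) can be illustrated with a minimal Python sketch. This is not the thesis's code. The class name `Formation`, its field names, the function `evaluate`, and the toy data are all hypothetical. The sketch also assumes that accuracy is computed over the words for which a prediction was produced and recall over all test words; the abstract does not state how the two figures are defined.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Formation:
    """The seven predicted items named in the abstract (field names are hypothetical)."""
    first_pos: str
    first_category: str
    first_meaning: str
    last_pos: str
    last_category: str
    last_meaning: str
    structure: str


def evaluate(gold: dict[str, Formation],
             predicted: dict[str, Optional[Formation]]) -> tuple[float, float]:
    """Return (accuracy, recall) under the all-seven-items-correct criterion.

    Assumption: a word with no prediction (None or absent) lowers recall
    but is excluded from the accuracy denominator.
    """
    attempted = {w: p for w, p in predicted.items() if p is not None and w in gold}
    # Dataclass equality compares every field, so one wrong item fails the match.
    correct = sum(1 for w, p in attempted.items() if p == gold[w])
    accuracy = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return accuracy, recall


# Toy data with placeholder labels, invented for illustration only.
base = Formation("N", "cat_a", "sense_1", "V", "cat_b", "sense_2", "struct_x")
gold = {"w1": base, "w2": base, "w3": base}
predicted = {
    "w1": base,                               # all seven items match
    "w2": replace(base, structure="struct_y"),  # one item wrong, so incorrect
    "w3": None,                               # no prediction produced
}

accuracy, recall = evaluate(gold, predicted)
print(f"accuracy={accuracy:.4f} recall={recall:.4f}")
```

Comparing whole records rather than scoring each item separately mirrors the all-or-nothing criterion: a prediction with six correct items and one wrong item scores the same as a completely wrong one.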

Keywords/Search Tags:

Language knowledge base, HowNet, Unknown words, Sense guessing, Word similarity

Related items

1	The Grammatical Function Of The Unknown Word Guessing
2	Research On Lexical Level Knowledge Mining Based On Corpus
3	Influence Of Background Knowledge And Unknown Word Rate On EFL Students' Listening Comprehension
4	A Study On The Influence Of Semantic Transparency And Context On Foreign Students’ Guessing Words And The Memory Retention Effect Of Guessing Words
5	A Study On The Knowledge And Order Of The Knowledge Used In Word-Guessing In The Course Of Chinese Reading By Korean Students
6	An Investigation On The Use Of Word Guessing Strategies In Chinese Reading For Middle And Advanced Nigerian Students
7	A Study On The Using Of Word-guessing Skills For Central Asian Students In Chinese Reading
8	The Unknown Words From Double Word Frequency Identification Study
9	Research On Knowledge Base Construction For Tibetan Function Words
10	A Research And Practice On Word Guessing Skills For Elementary Reading Course