Analysis And Recognition Of Unknown Suffix-Derived Words In Modern Chinese

Posted on:2013-01-21

Degree:Master

Type:Thesis

Country:China

Candidate:Q Wang

Full Text:PDF

GTID:2245330395453248

Subject:Linguistics and Applied Linguistics

Abstract/Summary:

In Chinese information processing, the basic task of language analysis is automatic word segmentation. Automatic segmentation currently faces two problems: the recognition of unknown words and the segmentation of ambiguous phrases. Recognizing unknown words is essential to segmenting Chinese text correctly, yet it is difficult to accomplish.

In recent years, many Chinese scholars working on unknown word recognition have focused on named entity recognition and have achieved a great deal. However, there is a paucity of research on suffix recognition. Most of the existing research has been based on individual cases, and exhaustive research within a defined range has rarely been done. Furthermore, there has been no research on word formation with suffixes, so the internal structure of the word has been neglected.

Based on theories of suffix segmentation and using quantitative database methods, an important tool in computational linguistics research, this dissertation sets a standard for suffix segmentation and compiles a suffix table for information processing.

This dissertation analyzes every suffix in the suffix table under different categorizations, observes their grammatical and semantic characteristics in the Derivatives (Word-Formation) Model, and applies the suffix segmentation standard to study quantitatively the word-formation models of known words in the database.

In the research on the Derivatives (Word-Formation) Model, this dissertation categorizes suffixes by meaning and, in light of the distribution of unknown words in the database, focuses on the characteristics and word-formation models of unknown derivative words with "们" and "者".

In the research on unknown word recognition, this dissertation conducts two sets of parallel experiments according to the different word-formation capabilities of derivatives and designs feature templates accordingly. It also conducts recognition experiments based on Conditional Random Fields and uses the results to test the feasibility of the approach. Finally, it draws an overall conclusion, summarizes the main work, and outlines a plan for further research.
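The abstract does not give the feature templates themselves. As an illustration only, the following is a minimal sketch of what a character-level feature template for CRF-based recognition of suffix-derived words might look like. The window size, the feature names, and the two-entry `SUFFIX_TABLE` are assumptions made for this example (only "们" and "者" are named in the abstract); they are not the dissertation's actual templates or suffix table.

```python
# Hypothetical suffix table: the thesis compiles its own table, and only
# these two suffixes are named in the abstract.
SUFFIX_TABLE = {"们", "者"}


def char_features(chars, i, suffixes=SUFFIX_TABLE):
    """Build features for chars[i] from a one-character window on each side.

    All feature names here are illustrative assumptions.
    """
    prev_c = chars[i - 1] if i > 0 else "<BOS>"
    next_c = chars[i + 1] if i < len(chars) - 1 else "<EOS>"
    return {
        "c0": chars[i],                      # current character
        "c-1": prev_c,                       # previous character
        "c+1": next_c,                       # next character
        "c-1c0": prev_c + chars[i],          # left bigram
        "c0c+1": chars[i] + next_c,          # right bigram
        "is_suffix": chars[i] in suffixes,   # current char is in the suffix table
        "next_is_suffix": next_c in suffixes,  # a listed suffix follows
    }


def sentence_features(sentence):
    """Return one feature dict per character of the input string."""
    chars = list(sentence)
    return [char_features(chars, i) for i in range(len(chars))]


# Example: the derivative word 学习者 ("learner"), formed with the suffix 者.
feats = sentence_features("学习者")
```

Each character is described by its neighbours and by whether it, or the character after it, appears in the suffix table. A sequence labeller such as a CRF could in principle use these cues to mark where a derivative word ends.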

Keywords/Search Tags:

Unknown Words, Suffix, Derivatives (Word-Formation) Model, CRF Model

Related items

1	A Study On The Semantic Word Formation Of Two - Word Words For The Identification And Understanding Of Ordinary Unsigned Signals
2	A Study Of Zhai Word-family And A Concurrent Discussion On The Formation And Use Of New-Word-Model
3	The Rules Of Word-Formation Of "Zi"-a Suffix In Modern Chinese
4	An Elementary Research Into The Rule Of Word-formation Of "者"-A Suffix In Modern Chinese
5	The Generative Mechanism Of Chinese Network Blended Words From The Aspect Of Connectionist Model
6	Statistics, Comparison And Analysis On Quasi-Suffix
7	The Grammatical Function Of The Unknown Word Guessing
8	A Study About The Suffix "Zi" In Qingliang Dialect, In She County, Hebei Province
9	Class Affix "family", "party", "fans", "control" Study
10	The Unknown Words From Double Word Frequency Identification Study