Chinese Language Processing Secondary Extraction And Recognition Of Nouns

Posted on:2007-01-19

Degree:Master

Type:Thesis

Country:China

Candidate:B Zhou

Full Text:PDF

GTID:2208360185956333

Subject:Computer software and theory

Abstract/Summary:

To give computers the ability to process and even comprehend natural language, researchers have developed many theories of semantic analysis. In Chinese language processing, most of these theories are based on Chinese word segmentation. Many Chinese word segmentation methods have been developed, but whichever method is used, segmentation ambiguity cannot be avoided. One of the main causes of this ambiguity is that many sentences contain proper nouns. Proper nouns frequently give rise to new words, change or disappear easily, and do not follow particular formation rules; all of this makes them hard to recognize in a sentence and makes Chinese word segmentation difficult.
This paper concentrates on the problem of proper noun recognition in Chinese word segmentation. It first introduces the progress of natural language processing, especially Chinese language processing, and the methods used for proper noun recognition. It then discusses in detail the recognition algorithms designed for our proper noun recognition system. All proper nouns are divided into two categories. The first, Stable Proper Nouns, contains proper nouns that persist over a long period and are in wide use. The second, Unstable Proper Nouns, contains proper nouns that frequently give rise to new proper nouns and do not follow particular formation rules; most proper nouns in this category are Chinese names. Different algorithms are used for the two categories: the first is recognized with a proper noun database, and the second with a method based on the Naive Bayesian classification algorithm.
After discussing the recognition algorithms, the paper introduces our proper noun recognition system, which implements them. The introduction proceeds in three steps: first, the overall recognition process of the system; then its static architecture, including the packages, the classes, and the interface between this system and its parent system, the Chinese language processing system; and finally the system's proper noun recognition flow, mainly including the caller-callee relationship between...
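
The two-category scheme in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the word set, the toy training pairs, the position-tagged character features, and the names `STABLE_PROPER_NOUNS`, `TRAINING`, `features`, `train`, `log_posterior`, and `classify` are all assumptions made for illustration. Candidates found in the (hypothetical) proper noun database are labeled stable; the rest are scored by a naive Bayes classifier with add-one smoothing.

```python
import math
from collections import Counter

# Hypothetical stand-in for the database of stable proper nouns.
STABLE_PROPER_NOUNS = {"北京", "长江", "清华大学"}

# Hypothetical toy training data: (candidate string, is it a Chinese name?).
TRAINING = [
    ("王伟", True), ("李娜", True), ("张伟", True), ("王芳", True),
    ("我们", False), ("今天", False), ("学习", False), ("张开", False),
]


def features(candidate):
    """Position-tagged characters: the first one (a possible surname) and the rest.

    Assumes a non-empty candidate string.
    """
    return ["first=" + candidate[0]] + ["rest=" + ch for ch in candidate[1:]]


def train(samples):
    """Count classes and per-class features for a naive Bayes model."""
    class_counts = Counter()
    feature_counts = {True: Counter(), False: Counter()}
    vocabulary = set()
    for text, label in samples:
        class_counts[label] += 1
        for feat in features(text):
            feature_counts[label][feat] += 1
            vocabulary.add(feat)
    return class_counts, feature_counts, vocabulary


def log_posterior(candidate, label, model):
    """Unnormalized log P(label | candidate) under the independence assumption."""
    class_counts, feature_counts, vocabulary = model
    score = math.log(class_counts[label] / sum(class_counts.values()))
    # Add-one smoothing; the extra +1 reserves mass for unseen features.
    denom = sum(feature_counts[label].values()) + len(vocabulary) + 1
    for feat in features(candidate):
        score += math.log((feature_counts[label][feat] + 1) / denom)
    return score


def classify(candidate, model):
    """Database lookup first, then the naive Bayes decision."""
    if candidate in STABLE_PROPER_NOUNS:
        return "stable"
    if log_posterior(candidate, True, model) > log_posterior(candidate, False, model):
        return "unstable"
    return "other"


model = train(TRAINING)
for word in ["北京", "王娜", "我们"]:
    print(word, classify(word, model))
```

A real system would need a far larger training corpus and context features; the sketch only shows how a database lookup and a Bayesian classifier could divide the work.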

Keywords/Search Tags:

natural language processing, Chinese word segmentation, proper noun recognition, naive Bayesian classification

Related items

1	Research On Chinese Word Segmentation Based On Text And Audio
2	Maximum Matching Chinese Word Segmentation Technology Based On Word Classification And Sorting
3	Study On Chinese Word Segmentation Based On Recurrent Neural Network Language Model
4	Research On Chinese Word Segmentation Integrating Pinyin And Tone Information
5	The Methodology And Implementation Of Chinese Natural Language Query In Databases
6	Study On Chinese Named Entity Recognition
7	Research On Chinese Word Segmentation Methods Based On Deep Learning
8	Research On Chinese Word Segmentation Based On Deep Learning
9	A Method Of Proper Nouns Identification Based On Double-level Model Of NSP And CRFs
10	Chinese Proper Names Recognition Based On Pattern Matching