| To enable computers to process and even comprehend natural language, researchers have developed many theories of natural language semantic analysis. In Chinese language processing, most of these theories are based on Chinese word segmentation. Many Chinese word segmentation methods have been developed, but whichever method is used, segmentation ambiguity cannot be avoided. One of the main causes of this ambiguity is that many sentences contain proper nouns. Proper nouns frequently give rise to new words, change or disappear easily, and do not follow particular formation rules; all of this makes them hard to recognize in a sentence and makes Chinese word segmentation difficult. This paper concentrates on the problem of proper noun recognition in Chinese word segmentation. First, it reviews the progress of natural language processing by computer, especially Chinese language processing, and the methods used for proper noun recognition. It then discusses in detail the recognition algorithms designed for our proper noun recognition system. All proper nouns are divided into two categories. The first, Stable Proper Nouns, contains proper nouns that persist over a long period and are in widespread use. The second, Unstable Proper Nouns, contains proper nouns that frequently give rise to new proper nouns and do not follow particular formation rules; most proper nouns in this category are Chinese names. The paper then discusses a different algorithm for each category: Stable Proper Nouns are recognized using a proper noun database, and Unstable Proper Nouns are recognized using a method based on the Naive Bayesian classification algorithm.
After discussing the recognition algorithms, the paper introduces our proper noun recognition system, which implements them. The introduction proceeds in three steps: first, the overall recognition process of the system; then the static architecture of the system, including its packages, classes, and the interface between this system and its parent system, the Chinese language processing system; and finally the proper noun recognition flow of the system, mainly including the caller-callee relationship between... |
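
The two-category scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the system's implementation: the small set standing in for the proper noun database, the character-level features, the toy training samples, and the names `STABLE_PROPER_NOUNS`, `NaiveBayesNameClassifier`, and `classify` are all assumptions made for illustration. The sketch only shows the general idea of looking a candidate up in a database first and falling back to a naive Bayes decision for name-like candidates.

```python
import math
from collections import Counter

# Hypothetical stand-in for the proper noun database (stable proper nouns).
STABLE_PROPER_NOUNS = {"北京", "长江", "联合国"}


class NaiveBayesNameClassifier:
    """Toy character-level naive Bayes classifier: is a candidate a name?"""

    def __init__(self):
        self.char_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        self.vocab = set()

    def train(self, samples):
        # samples: iterable of (text, is_name) pairs; both classes must occur.
        for text, is_name in samples:
            self.class_counts[is_name] += 1
            self.char_counts[is_name].update(text)
            self.vocab.update(text)

    def log_score(self, text, label):
        # log P(label) + sum over characters of log P(char | label),
        # using add-one smoothing so unseen characters get nonzero probability.
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[label] / total)
        denom = sum(self.char_counts[label].values()) + len(self.vocab) + 1
        for ch in text:
            score += math.log((self.char_counts[label][ch] + 1) / denom)
        return score

    def is_name(self, text):
        return self.log_score(text, True) > self.log_score(text, False)


def classify(candidate, classifier):
    """Return 'stable', 'unstable', or 'other' for a candidate string."""
    if candidate in STABLE_PROPER_NOUNS:   # category 1: database lookup
        return "stable"
    if classifier.is_name(candidate):      # category 2: naive Bayes decision
        return "unstable"
    return "other"


# Tiny illustrative training set (invented for this sketch).
classifier = NaiveBayesNameClassifier()
classifier.train([
    ("王伟", True), ("李娜", True), ("张伟", True), ("王芳", True),
    ("今天", False), ("天气", False), ("学习", False), ("我们", False),
])
```

A call such as `classify("北京", classifier)` takes the database path, while a candidate absent from the database is decided by comparing the two smoothed log scores. A real system would need far richer features and training data than this toy example uses.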