| A Chinese automatic word segmentation system is a computer application that segments and identifies the words in Chinese text. The system mainly comprises an automatic word segmentation module, an ambiguity resolution module and a proper noun identification module; these modules depend on one another, and together they determine the quality, value and level of application of the system. Chinese automatic word segmentation methods divide into mechanical and non-mechanical methods. The forward maximum matching method, the backward maximum matching method and the word-by-word traversal method are the basic mechanical methods; the other eight types are not truly separate mechanical methods, since they only apply certain techniques on top of the basic ones. The expert system method is a rule-based segmentation method, while the neural network method is a Chinese word segmentation technique based on the principles of artificial neural networks. Drawing on research into and system designs for automatic word segmentation in China and abroad, this paper proposes a theoretical model of the automatic word segmentation system, CWSM:M(F,W,T,K), whose components are the mechanical word segmentation method, the word segmentation dictionary, the Chinese text and the knowledge base. The paper also introduces evaluation criteria for automatic word segmentation. The ambiguity that arises during word segmentation consists mainly of ambiguity specific to computer segmentation, the inherent ambiguity of natural language, and ambiguity caused by the size of the word segmentation lexicon. Ambiguous fields can be classified in three ways. By segmentation result, they divide into true ambiguity and false ambiguity. 
By the level of knowledge needed to segment the ambiguous field, they divide into syntactic ambiguity, semantic ambiguity and pragmatic ambiguity. By the structure of the ambiguous field, they divide into intersection fields and multi-meaning fields. Methods for segmenting ambiguous intersection fields include statistical methods and part-of-speech methods. Ambiguous multi-meaning fields can be treated at three levels: syntactic, semantic and pragmatic. In Chinese information systems, nouns are the most frequently used words, and proper nouns are especially difficult to handle in Chinese automatic word segmentation. First, this paper analyses the characteristics of surnames and given names in Chinese personal names and proposes a technique for identifying Chinese names automatically. Second, it uses a knowledge base and a rule base to identify place names through an inference mechanism. Third, it takes college names as an example of organization name identification and, based on their grammatical, semantic and organizational characteristics, proposes rules for identifying them. In addition, the paper briefly analyses the relationships among organization names, personal names and place names. |
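
The two maximum matching methods named in the abstract, and the way an intersection ambiguity can surface when they disagree, can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the toy dictionary and the sample sentence are all assumptions chosen for illustration, and unmatched characters are simply emitted as single-character words.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Scan left to right, always taking the longest dictionary word."""
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is accepted even if it is not in the dictionary.
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                i += size
                break
    return words


def backward_max_match(text, dictionary, max_len=None):
    """Scan right to left, always taking the longest dictionary word."""
    if max_len is None:
        max_len = max((len(w) for w in dictionary), default=1)
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            candidate = text[j - size:j]
            if size == 1 or candidate in dictionary:
                words.append(candidate)
                j -= size
                break
    words.reverse()  # words were collected from the end of the text
    return words


# Hypothetical toy dictionary and sentence (assumptions, not from the paper).
toy_dictionary = {"研究", "研究生", "生命", "的", "起源"}
sentence = "研究生命的起源"

forward = forward_max_match(sentence, toy_dictionary)
backward = backward_max_match(sentence, toy_dictionary)

print(forward)   # ['研究生', '命', '的', '起源']
print(backward)  # ['研究', '生命', '的', '起源']

# A disagreement between the two scans flags a candidate ambiguous field.
print(forward != backward)  # True
```

Here the two scans disagree on the span 研究生命, which is an intersection field: 研究生 and 生命 share the character 生. Comparing the two results is only one simple way to detect such fields; choosing between the candidate segmentations still needs the statistical or part-of-speech information the abstract mentions.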