Research On Web Information Extraction Applied To Chinese Name Search Engine

Posted on:2007-01-07

Degree:Master

Type:Thesis

Country:China

Candidate:Y Wang

GTID:2178360182993958

Subject:Computer software and theory

Abstract/Summary:

Web information extraction is the process of extracting needed information from Web documents. This thesis studies information extraction and applies it to a subject-oriented search engine whose subject is Chinese names. It examines Web information extraction techniques for Web pages about Chinese names, designs an information extraction model, and tests it. The model extracts a person's attributes (birthday, occupation, place, and organization) from Web documents. The thesis explains in detail the system flow, the methods used by each submodule in that flow, and the concrete techniques used in the information extraction module. Different pattern extraction algorithms are applied to different classes of Web documents. For the person-introduction class, a knowledge engineering approach is used and the pattern repository is built manually. For the person-action class, an automatic training approach is used, and a new algorithm is proposed to extract patterns from the training set automatically. Finally, experiments on Web pages about individual people show that the information extraction model extracts the correct information reasonably accurately and meets the requirements.
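As a rough illustration of the knowledge engineering approach mentioned in the abstract, the following minimal sketch matches a hand-built pattern repository against text to pull out the four attributes. It is not the thesis's implementation: the repository layout, the regular expressions, the function name `extract_attributes`, and the English sample sentence are all assumptions made for this example (the thesis itself works on Chinese Web pages, and its actual patterns are not given here).

```python
import re

# Hypothetical hand-built pattern repository: attribute name -> list of regexes.
# Every pattern has exactly one capture group that holds the attribute value.
PATTERN_REPOSITORY = {
    "birthday": [
        re.compile(r"\bborn on ([A-Z][a-z]+ \d{1,2}, \d{4})"),
    ],
    "place": [
        # The date between "born" and "in" is optional.
        re.compile(r"\bborn (?:on [A-Z][a-z]+ \d{1,2}, \d{4} )?in ([A-Z][a-z]+)"),
    ],
    "occupation": [
        # One or two lowercase words after "is a/an", followed by and/at/in.
        re.compile(r"\bis an? ([a-z]+(?: [a-z]+)?) (?:and|at|in)\b"),
    ],
    "organization": [
        # A run of capitalized words after "works at".
        re.compile(r"\bworks at ([A-Z][A-Za-z]*(?: [A-Z][A-Za-z]*)*)"),
    ],
}


def extract_attributes(text):
    """Return a dict of attribute -> first value matched by any pattern."""
    result = {}
    for attribute, patterns in PATTERN_REPOSITORY.items():
        for pattern in patterns:
            match = pattern.search(text)
            if match:
                result[attribute] = match.group(1)
                break  # keep the first pattern that fires for this attribute
    return result


# Made-up sample sentence, used only to exercise the patterns above.
sample = (
    "Wang Wei was born on March 5, 1970 in Nanjing. "
    "He is a software engineer and works at Example Software."
)
print(extract_attributes(sample))
```

Keeping the patterns in a plain dictionary means new hand-written rules can be added per attribute without touching the matching loop; an automatically trained pattern set, as the abstract describes for the person-action class, could in principle be loaded into the same structure.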

Keywords/Search Tags:

information extraction, search engine, pattern match

Related items

1	On The Research And Development Of A Video Search Engine For Chinese Web
2	Study On The Key Algorithm Of Vertical Search Engine In Silk Area
3	Intelligent Search Engine Based On Thematic Information Technology Research
4	Design An Enterprise Information Search Engine And Research Its Key Technology
5	Research On Web Information Extraction Technology In Vertical Search Engine
6	The Key Technologies Of Agriculture Search Engine Research
7	Design And Implementation Of News-Collecting System
8	Design And Implementation Of Core Word Extraction System In Search Engine
9	Research And Implementation Of Page Object Extraction Model For Vertical Search Engine
10	Based On Solaris In English And Chinese Search Engine Design And Realization