Search Strategies for Spiders in Network-Based Professional Search Engines

Posted on: 2008-08-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Feng
Full Text: PDF
GTID: 2208360215450321
Subject: Software engineering
Abstract/Summary:
The design of a Spider for a professional search engine covers two parts: the system architecture and the search strategy.

On the architecture side, most studies of professional search engines focus on load balancing and on delimiting the search space. So far, none has been found that focuses on the object the Spider collects: the distribution of network resources. Because a professional search engine is relatively small, its Spider cannot operate at a huge scale, so the distribution of professional resources must be considered when designing the system architecture.

On the search strategy side, general search engines either search exhaustively by IP address or traverse the Web graph depth-first or breadth-first. These strategies waste system resources and make it difficult to meet the needs of a professional search engine. More recently, heuristic search strategies have been proposed, and similarity calculations based on Web structure mining and Web content mining have been used in designing search strategies for professional search engines.

As a research topic combining theory and practice, the author's main work and achievements are:
1. Studied Spider search algorithms and summarized the main classes of current search strategies; compared several typical international search algorithms and presented the results in charts.
2. Adopted, as the system's strategy, a PageRank algorithm extended with professional-field factors, named the "Comprehensive list of value" strategy.
3. Designed and implemented a distributed intelligent Spider system.
4. Coded a key sub-module, the Source Collecting Module.
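
The abstract does not give the formula that combines PageRank with professional-field factors, so the following is only a minimal sketch of the general idea under stated assumptions, not the author's method. It assumes a linear blend of normalized PageRank and a simple term-overlap relevance score, and it orders a best-first (heuristic) traversal with a priority queue. The names `topic_relevance`, `combined_score`, `crawl_order` and the weight `alpha` are hypothetical. For simplicity the link graph and page texts are known in advance; a real Spider would have to estimate both for pages it has not yet fetched.

```python
import heapq


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {page: [outlinks]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def topic_relevance(text, topic_terms):
    """Assumed relevance measure: fraction of topic terms found in the text."""
    if not topic_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(1 for term in topic_terms if term in words) / len(topic_terms)


def combined_score(rank, relevance, alpha=0.5):
    """Assumed blend: alpha weights link structure, (1 - alpha) weights content."""
    return alpha * rank + (1.0 - alpha) * relevance


def crawl_order(links, texts, topic_terms, seed, alpha=0.5, limit=10):
    """Best-first traversal: always expand the highest-scoring known page."""
    ranks = pagerank(links)
    max_rank = max(ranks.values())

    def score(page):
        rel = topic_relevance(texts.get(page, ""), topic_terms)
        return combined_score(ranks.get(page, 0.0) / max_rank, rel, alpha)

    # heapq is a min-heap, so scores are negated to pop the best page first.
    frontier = [(-score(seed), seed)]
    seen = {seed}
    order = []
    while frontier and len(order) < limit:
        _, page = heapq.heappop(frontier)
        order.append(page)
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-score(target), target))
    return order


links = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
texts = {
    "a": "home",
    "b": "cooking recipes",
    "c": "web spider search engine",
    "d": "search engine crawler",
}
topic_terms = ["search", "engine", "spider"]
order = crawl_order(links, texts, topic_terms, seed="a")
```

In this toy graph the off-topic page `b` is visited last, whereas a plain breadth-first traversal would reach it before `d`. That difference is the kind of saving in system resources that motivates a heuristic strategy for a small, topic-specific Spider.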
Keywords/Search Tags: Search Engine, Intelligent Web Spider, Search Strategy