
Design and Realization of an English and Chinese Search Engine Based on Solaris

Posted on: 2007-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: W H Feng
Full Text: PDF
GTID: 2208360185456467
Subject: Computer system architecture
Abstract/Summary:
As the volume of information on the Internet grows geometrically, the search engine, the tool users rely on to find information, has attracted increasing attention. A good search engine can greatly increase a Web site's traffic and has become the face of the site. At the same time, with the increasing popularity of Chinese on the Internet, how to search Chinese information quickly and effectively has become a hot issue.

A search engine generally consists of a spider, an index database, a searcher, and a user interface. The spider downloads pages from the Web; a parser analyzes the pages in order to build the index; the index organizes the documents in a form that is easy to search and is stored in the index database; the searcher computes how well the target documents match the user's query keywords; and the user interface provides a web page where users can enter queries and customize the results, and returns the formatted results to the browser.

The main task of this project was to improve an existing search engine for English and Chinese that runs on the Solaris OS. This search engine already supported English single-word queries, English AND queries, English OR queries, and Chinese single-word queries. Our main job was to add AND and OR search for Chinese while keeping the general design intact. After analyzing the source code, we added Chinese Boolean operations and improved the interface between the user's browser and the back-end search engine at the web front end in order to optimize queries. Having studied the overall framework of the existing system, and taking into account the differences in how Chinese and English are represented, we adopted a trade-off algorithm to process the three-level index structure for Chinese characters and so implement the Chinese Boolean operations. This can greatly improve the speed and accuracy of Chinese queries.
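As an illustration of what Boolean AND and OR over a Chinese index involve, the following is a minimal Python sketch. It is not the thesis's implementation: it assumes a flat, character-level inverted index rather than the three-level index structure mentioned above, and it treats a term as matching a document when every character of the term appears somewhere in it, with no position check. The names `build_index`, `lookup`, `boolean_and`, and `boolean_or` are hypothetical.

```python
from collections import defaultdict


def build_index(docs):
    """Map each non-space character to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for ch in text:
            if not ch.isspace():
                index[ch].add(doc_id)
    return index


def lookup(index, term):
    """Documents that contain every character of `term` (assumed matching rule)."""
    postings = [index.get(ch, set()) for ch in term]
    return set.intersection(*postings) if postings else set()


def boolean_and(index, terms):
    """Documents matching all terms: intersect the per-term result sets."""
    results = [lookup(index, t) for t in terms]
    return set.intersection(*results) if results else set()


def boolean_or(index, terms):
    """Documents matching at least one term: union the per-term result sets."""
    return set().union(*(lookup(index, t) for t in terms))
```

With this sketch, an AND query intersects the posting sets of the terms and an OR query unions them. A real engine would also have to check character adjacency or use word segmentation so that a term is not matched by scattered characters.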
In addition, we corrected some errors and flaws in the original program, and the improvements to the user interface are described in detail in this thesis. Some future work is also discussed. The first chapter introduces the concept, categories, and common search process of search engines, as well as their development trends. The second chapter discusses the design and...
Keywords/Search Tags:search engine, index, filter, distributed, match