Research and Design of a Personalized News Search Engine

Posted on: 2013-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: J C Zhu
Full Text: PDF
GTID: 2248330374485464
Subject: Computer application technology
Abstract/Summary:
With the development of computer technology, the information age has arrived. Helping users obtain the information they need quickly and accurately is a problem that must be solved as soon as possible, and search engines arose to address it. Vertical search engines are an important class of search engine: they can help users obtain the information they need more efficiently and accurately than general-purpose search engines do. In addition, personalization technology can provide different search results to different users, which greatly improves users’ search satisfaction. At present, most search engines take only the user’s query terms into consideration when executing a search and ignore the user’s fields of interest, so most of the retrieved results are unrelated to the user’s real needs. Studying the personalization of vertical search engines is therefore an effective way to overcome this bottleneck in result relevance.

Starting from the basic concepts, principles, and working structure of search engines, this thesis studies the Web crawler module, the index module, and the search module. It then proposes a personalization solution suitable for vertical search engines that incorporates a user interest model. Finally, the thesis implements a news search engine named PNSE as an example of a personalized vertical search engine. The Web crawler module collects data from major portal sites and filters pages by topical relevance. The index module introduces document classification technology, which treats different types of documents differently while preserving indexing efficiency. The search module combines the user interest model with document classification, which can improve the relevance of search results to users’ real needs.

The contributions of this thesis fall into three areas. First, it proposes a specialized crawler solution that can be used in personalized vertical search. Existing vertical search engines do not take page-relevance filtering rules into consideration, which can generate many noise pages. This thesis introduces relevance filtering rules into the crawler, which can improve the efficiency of data collection by reducing the crawling of irrelevant Web pages. Second, the thesis provides a document classification method suited to vertical search engines. Current vertical search engines classify Web documents by URL or column title, which often leaves categories missing or scattered. This thesis adopts the feature selection functions and document classification methods used in general-purpose search engines and applies them to vertical search after optimizing and improving them. Third, the thesis presents a user interest modeling method based on relevance feedback techniques, which gives the search engine stronger personalization capabilities.
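
The abstract does not say which relevance feedback algorithm PNSE uses. As an illustration only, the sketch below uses the classic Rocchio update, a standard relevance feedback technique, to maintain a term-weighted interest profile and then blends profile similarity with query similarity when ranking. The function names, the weights (alpha, beta, gamma, mix), and the whitespace tokenizer are all assumptions made for this example and are not taken from the thesis.

```python
import math
from collections import Counter


def tf_vector(text):
    """Turn text into a term-frequency vector (naive whitespace tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rocchio_update(profile, clicked, skipped, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style update (assumed weights): move the interest profile
    toward the centroid of clicked documents and away from skipped ones."""
    updated = Counter()
    for term, weight in profile.items():
        updated[term] += alpha * weight
    for docs, factor in ((clicked, beta), (skipped, -gamma)):
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += factor * weight / len(docs)
    # Keep only positively weighted terms, as is common with Rocchio.
    return {t: w for t, w in updated.items() if w > 0}


def rerank(query, profile, docs, mix=0.5):
    """Order docs by a blend of query similarity and profile similarity."""
    q = tf_vector(query)
    scored = []
    for doc in docs:
        d = tf_vector(doc)
        score = (1 - mix) * cosine(q, d) + mix * cosine(profile, d)
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


# Hypothetical feedback: the user clicked a sports story and skipped a finance one.
profile = rocchio_update(
    {},
    clicked=[tf_vector("football league match")],
    skipped=[tf_vector("stock market prices")],
)
results = rerank(
    "match report",
    profile,
    ["market match report", "football match report"],
)
```

In this toy run both candidate documents match the query equally well, so the profile term alone decides the order and the football story is ranked first. A production system would replace raw term frequencies with TF-IDF weights and a proper tokenizer.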
Keywords/Search Tags: Vertical search, Personalization, User interest model, Feature selection, Document classification