Research On Key Technologies Of Domain Specific Information Retrieval

Posted on: 2008-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: K Kang
Full Text: PDF
GTID: 2178360242979465
Subject: Computer application technology
Abstract/Summary:
As Web information grows more diverse, a single general-purpose search engine for all users can no longer meet the needs of users with specialized queries. Domain specific search engines have emerged in response, and domain specific information retrieval provides substantial support for research on them. This thesis analyzes three important aspects of domain specific information retrieval: the acquisition, filtering and retrieval of domain specific information. It also constructs a general framework for a domain specific information retrieval system.

For acquiring domain specific information, the common practice is to use focused crawlers to collect Web pages. Meta-search engines, however, appear to be a better way to obtain fresher information. Because their component search engines are general-purpose, it is difficult to retrieve content related to a specific domain through them. A query expansion method based on a statistical translation model is designed to solve this problem.

Text categorization is the foundation of domain text filtering. An improved feature selection method for naive Bayesian text classifiers is introduced. Experiments show that the improved method achieves higher recall and higher precision in text classification for domain specific information retrieval.

Information retrieval based on language models is the main focus of this thesis. Two extension frameworks and several related methods are proposed to improve retrieval effectiveness. First, a hidden Markov model is combined with a Bayesian smoothing method. Second, a new algorithm based on dynamic Bayesian networks is proposed for obtaining the explanation probabilities between terms. Experiments show that the improved model performs markedly better for domain specific information retrieval than some traditional retrieval techniques, and that the extended framework is readily extensible.
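
To make the language-model retrieval idea above concrete, the following is a minimal sketch of query-likelihood ranking with Bayesian (Dirichlet prior) smoothing. It is not the thesis's method: the abstract does not give the model's details, and the hidden Markov model and dynamic Bayesian network components are omitted. The function name `dirichlet_score`, the toy documents, and the prior weight `mu` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def dirichlet_score(query_terms, doc_terms, collection_tf, collection_len, mu=2000.0):
    """Log query likelihood of a document under a Dirichlet-smoothed unigram model.

    Hypothetical helper: mu is the Dirichlet prior weight (an assumed value).
    """
    doc_tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        # Background probability of the term in the whole collection.
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # term never occurs in the collection: skip it
        # Smoothed estimate: document counts interpolated with the background model.
        p_term = (doc_tf[term] + mu * p_coll) / (doc_len + mu)
        score += math.log(p_term)
    return score


# Toy collection (invented for illustration only).
docs = {
    "d1": "focused crawler collects domain pages".split(),
    "d2": "language model retrieval with smoothing".split(),
}
collection_tf = Counter(t for terms in docs.values() for t in terms)
collection_len = sum(collection_tf.values())

query = "language model".split()
ranking = sorted(
    docs,
    key=lambda d: dirichlet_score(query, docs[d], collection_tf, collection_len),
    reverse=True,
)
print(ranking)
```

Smoothing with the collection model keeps a document's score finite when it lacks a query term, which is why documents are ranked by relative rather than zeroed-out likelihoods.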
Keywords/Search Tags:domain specific information retrieval, text categorization, language model