Research On Text Information Retrieve Model

Posted on:2008-04-21

Degree:Master

Type:Thesis

Country:China

Candidate:G Huang

Full Text:PDF

GTID:2178360215466145

Subject:Computer application technology

Abstract/Summary:


With the development of the Internet, people are entering the information age. At the same time, the gap between the vast amount of digital information available and the information people actually need is becoming more and more acute. How to retrieve the required information quickly and precisely has become a hot topic in the information field. Organizations and companies in China and abroad have developed many kinds of information retrieval models, together with corresponding search engines, which to some extent help users navigate the Internet and find the information they need. However, these models have limitations, such as returning a large amount of irrelevant information and missing important information.

First, starting from information retrieval models, this thesis sets out the basic theory and algorithmic principles of traditional information retrieval models. It then focuses on domain ontology and on three kinds of semantic similarity computation models built on domain ontology, based on distance, content, and attributes respectively. These three models quantify the semantic similarity between concepts from three different points of view: (1) The distance-based model is simple and intuitive, but it depends heavily on the previously established concept hierarchy, whose structure directly affects the computed semantic similarity. (2) The content-based model is more convincing in theory because it makes full use of information theory and of probability and statistics. However, it cannot finely differentiate the semantic similarity values among the concepts in the hierarchy. (3) The attribute-based model simulates human recognition and discrimination well, but requires a detailed and comprehensive description of every attribute of the objects. Therefore, in view of the advantages and disadvantages of these models and the characteristics of domain ontology, we propose an improved domain ontology-based semantic similarity computation model, which builds on the distance-based model and adds a concept's information content and attributes as two decision factors.

Building on this theory, a comparison of the statistics-based and ontology-based information retrieval models shows that the two complement each other to some degree: (1) The statistics-based model emphasizes the statistical information of keywords but ignores the semantic information between them. (2) The ontology-based model does the opposite. This thesis proposes a hybrid information retrieval model that takes advantage of both. We construct a prototype of an advanced information retrieval system based on this model and explain the functions and principles of its main parts.

Finally, a test system, a C/S-mode text information retrieval system, is designed using JSP technology. We developed a domain ontology covering the first three chapters of Data Structure with Protege, and built the experimental environment with Apache Tomcat 5.0 as the Web server and Microsoft Access (Microsoft Office XP Professional) as the database. The results of several experiments show that, compared with the statistics-based and ontology-based models, the proposed model clearly improves recall and precision.
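
The abstract does not give the thesis's formulas, so the following Python sketch only illustrates the general idea: a distance-based similarity adjusted by information content and attributes, then blended with a keyword (VSM-style) cosine score. Everything in it is an assumption made for illustration: the toy hierarchy, attribute sets, frequency counts, the linear weighting in `concept_sim`, the `lam` blend in `hybrid_score`, and the specific measures used (inverse path length, a Lin-style information-content ratio, and Jaccard overlap). It should not be read as the author's actual model.

```python
import math
from collections import Counter

# Hypothetical toy is-a hierarchy (child -> parent); not taken from the thesis.
PARENT = {
    "data_structure": None,
    "linear_list": "data_structure",
    "tree": "data_structure",
    "stack": "linear_list",
    "queue": "linear_list",
    "binary_tree": "tree",
}
ROOT = "data_structure"

# Hypothetical attribute sets and corpus frequency counts per concept.
ATTRS = {
    "stack": {"ordered", "restricted_access", "lifo"},
    "queue": {"ordered", "restricted_access", "fifo"},
    "binary_tree": {"hierarchical", "two_children"},
}
FREQ = {"data_structure": 10, "linear_list": 6, "tree": 4,
        "stack": 5, "queue": 5, "binary_tree": 10}


def path_to_root(c):
    """Concepts from c up to the root, c included."""
    path = []
    while c is not None:
        path.append(c)
        c = PARENT[c]
    return path


def lca(a, b):
    """Lowest common ancestor of a and b in the hierarchy."""
    on_b = set(path_to_root(b))
    return next(c for c in path_to_root(a) if c in on_b)


def distance_sim(a, b):
    """Distance-based: inverse of (1 + number of edges between a and b)."""
    common = lca(a, b)
    edges = path_to_root(a).index(common) + path_to_root(b).index(common)
    return 1.0 / (1.0 + edges)


def subtree_freq(c):
    """Occurrences of c plus those of all its descendants."""
    return FREQ[c] + sum(subtree_freq(k) for k, p in PARENT.items() if p == c)


def info_content(c):
    """Information content: -log of the concept's relative frequency."""
    return -math.log(subtree_freq(c) / subtree_freq(ROOT))


def content_sim(a, b):
    """Content-based (Lin-style): 2 * IC(lca) / (IC(a) + IC(b))."""
    denom = info_content(a) + info_content(b)
    return 2.0 * info_content(lca(a, b)) / denom if denom > 0 else 1.0


def attribute_sim(a, b):
    """Attribute-based: Jaccard overlap of the two attribute sets."""
    sa, sb = ATTRS.get(a, set()), ATTRS.get(b, set())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def concept_sim(a, b, w=(0.5, 0.25, 0.25)):
    """Assumed linear blend: distance as the base, content and attributes as factors."""
    return w[0] * distance_sim(a, b) + w[1] * content_sim(a, b) + w[2] * attribute_sim(a, b)


def cosine(q_tokens, d_tokens):
    """Keyword (VSM-style) cosine similarity over raw term counts."""
    q, d = Counter(q_tokens), Counter(d_tokens)
    dot = sum(q[t] * d[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0


def hybrid_score(q_tokens, d_tokens, lam=0.5):
    """Assumed hybrid: lam * keyword score + (1 - lam) * best concept match."""
    q_c = [t for t in q_tokens if t in PARENT]
    d_c = [t for t in d_tokens if t in PARENT]
    semantic = max((concept_sim(a, b) for a in q_c for b in d_c), default=0.0)
    return lam * cosine(q_tokens, d_tokens) + (1.0 - lam) * semantic


if __name__ == "__main__":
    print(concept_sim("stack", "queue"), concept_sim("stack", "binary_tree"))
    print(hybrid_score(["stack", "push"], ["queue", "enqueue"]))
```

In this toy setup, a query about "stack" and a document about "queue" share no keywords, so the cosine term alone is zero, while the ontology term still gives the document a nonzero score. That is the kind of complementarity between statistics-based and ontology-based retrieval the abstract describes.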

Keywords/Search Tags:

IR, VSM, Ontology, Hybrid


Related items

1	Research And Implementation Of Recommendation System Based On Bilingual Ontology Mapping Of Books
2	Study On The Theory And Practice Of Ontology And Ontology-based Agricultural Document Retrieval System--Floricultural Ontology Modeling
3	Research On Product Design Ontology Management And Application Based On Semantics
4	An Approach For Measuring And Comparing Structural Semantics Of Ontologies Based On Graph Derivation
5	The Research Of Analysis And Storage On Large-scale Semantic Data
6	Imprecise Ontology Merging Research
7	Research On Some Problems Of Formal Ontology Engineering
8	Research On Key Technologies For Ontology Management
9	Research Of Ontology Approaches And Their Applications In Spatio-temporal Reasoning
10	Research On Several Key Technology Of Ontology Engineering