Full-text Search Method Based On The Compression Principle

Posted on:2005-01-21

Degree:Master

Type:Thesis

Country:China

Candidate:X J Lian

GTID:2208360125467130

Subject:Computer application technology

Abstract/Summary:

The rapid recent growth of text information poses new challenges to traditional information retrieval (IR). Most of the information available to us is stored in documents of many kinds, so how to compare the similarity between documents becomes one of the most crucial factors in the IR process. The traditional method of calculating the similarity between texts is to use the cosine coefficient in the vector space. Building on previous research, we summarize another method that uses data compression theory: the compression ratio is calculated and used to express the similarity between texts. It has some advantages over the statistics-based method: it can reflect latent statistical characteristics, and it is independent of keywords. In addition, we cluster associated documents. Cluster-based retrieval has as its foundation the cluster hypothesis, which states that closely associated documents tend to be relevant to the same requests. Clustering picks out closely associated documents and groups them together into one cluster. We use a genetic algorithm to search for associated documents. The results show that the method is reasonable and valid.
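The abstract does not give the exact similarity formula used in the thesis. As an illustration only, the general idea of compression-based similarity can be sketched with the normalized compression distance (NCD), a well-known measure that is assumed here and may differ from the author's definition. The function names `compressed_size` and `ncd`, the choice of `zlib` as compressor, and the sample texts are all assumptions made for this sketch.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def ncd(text_a: str, text_b: str) -> float:
    """Normalized compression distance between two texts.

    Lower values mean the texts share more structure, because the
    concatenation compresses almost as well as the larger text alone.
    This is an assumed stand-in for the thesis's similarity measure.
    """
    a = text_a.encode("utf-8")
    b = text_b.encode("utf-8")
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    size_ab = compressed_size(a + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


if __name__ == "__main__":
    # Hypothetical sample documents, not taken from the thesis.
    doc_1 = "Clustering groups closely associated documents together. " * 20
    doc_2 = "Clustering groups closely related documents into one cluster. " * 20
    doc_3 = "A genetic algorithm evolves candidates by mutation and selection. " * 20

    print("doc_1 vs doc_2:", ncd(doc_1, doc_2))
    print("doc_1 vs doc_3:", ncd(doc_1, doc_3))
```

No keyword list or term vector is needed, which matches the abstract's point that the method is independent of keywords. On very short texts the values are noisy, because the compressor's fixed overhead dominates the compressed size.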

Keywords/Search Tags:

Text Information Retrieval, Data Compression, Similarity, Cluster-based Retrieval, Genetic Algorithm

Related items

1	Information Retrieval Technology Based On Data Compression
2	A Stable Information Retrieval Algorithm And Its Application In Peer To Peer Network
3	Study On Information Retrieval Of Quality Internet Public Opinion Monitoring System
4	Research On Large Complex Equipment Text Repair Case Retrieval Algorithm Based On Ontology
5	Research And Application Of Full Text Retrieval Based On Hadoop
6	Construction Of Kernels For Text Similarity Detection And Application In Distributed Information Retrieval
7	Text Retrieval Based On Real-time Twitter Streaming
8	Research And Implementation Of Full-text Retrieval Combining Word Matching And Context Interaction
9	A Restricted Domain Text Retrieval System
10	Research On The Full Text Retrieval In Scientific Literature Sharing Platform