Integrating DB-IR Using Multi-Indexes

Posted on: 2012-10-07  Degree: Master  Type: Thesis
Country: China  Candidate: T Qi  Full Text: PDF
GTID: 2218330362956498  Subject: Computer software and theory
Abstract/Summary:
With the development of electronic libraries, business office automation, and the Internet, a large amount of unstructured data has accumulated in database management systems (DBMS). If the full-text index is kept outside the DBMS, it is very difficult to keep the data and its index consistent, so an external index is unsuitable for applications that are sensitive to performance or flexibility. How to combine the data and its index organically is a mainstream topic in the field of Database-Information Retrieval integration (DB-IR integration).

To retrieve vast amounts of unstructured data quickly, full-text indexing technology from the Information Retrieval (IR) domain is needed. Although many data structures can implement a full-text index, the inverted index is now the mainstream choice. Existing work on organic combination uses a single-segment index, whose performance is poor. This thesis proposes a full-text index implemented as a multi-segment index (one or more inverted indexes) to improve the performance of building, updating, and querying. Improvements to the index segment structure, covering the source-table key, the key sequence number stored in the word list of an index segment, and deletion information stored in a bit vector, are also introduced to enhance query and deletion performance. An experiment verifies that the proposed full-text index outperforms the full-text indexing of existing DBMS in both indexing and querying. The thesis also discusses how to implement the multi-segment index with the B+-tree, a data structure widely used in DBMS. Finally, a concurrency control and recovery mechanism is designed to ensure the ACID properties of transactions that involve the multi-segment index.
Keywords/Search Tags:DB-IR, full-text index, inverted index, ACID
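
The abstract gives no implementation details, so the following is only a minimal in-memory sketch of the general idea it describes: a full-text index made of several inverted-index segments, where each segment's word list stores sequence numbers that map back to source-table keys, and deletions are recorded in a per-segment bit vector. All names (Segment, MultiSegmentIndex, buffer_limit) and the whitespace tokenizer are assumptions made for illustration. They are not taken from the thesis, which builds its segments on B+-trees inside the DBMS.

```python
from collections import defaultdict


class Segment:
    """One immutable inverted-index segment (hypothetical layout)."""

    def __init__(self, docs):
        # docs: list of (source_key, text). A document's position in the
        # list is the segment-local sequence number kept in posting lists.
        self.keys = [key for key, _ in docs]
        self.postings = defaultdict(list)
        for seq, (_, text) in enumerate(docs):
            for word in set(text.lower().split()):
                self.postings[word].append(seq)
        # Deletion bit vector: one bit per sequence number.
        self.deleted = bytearray((len(docs) + 7) // 8)

    def mark_deleted(self, seq):
        self.deleted[seq // 8] |= 1 << (seq % 8)

    def is_deleted(self, seq):
        return bool(self.deleted[seq // 8] & (1 << (seq % 8)))

    def search(self, word):
        # Translate surviving sequence numbers back to source-table keys.
        return [self.keys[seq]
                for seq in self.postings.get(word.lower(), [])
                if not self.is_deleted(seq)]


class MultiSegmentIndex:
    """Full-text index made of one or more inverted-index segments."""

    def __init__(self, buffer_limit=2):
        self.segments = []
        self.buffer = []                  # documents not yet in a segment
        self.buffer_limit = buffer_limit  # assumed flush threshold

    def add(self, key, text):
        self.buffer.append((key, text))
        if len(self.buffer) >= self.buffer_limit:
            self.flush()

    def flush(self):
        # New documents become a new segment; old segments are untouched.
        if self.buffer:
            self.segments.append(Segment(self.buffer))
            self.buffer = []

    def delete(self, key):
        # Deleting only flips a bit; posting lists are not rewritten.
        self.buffer = [(k, t) for k, t in self.buffer if k != key]
        for segment in self.segments:
            for seq, k in enumerate(segment.keys):
                if k == key:
                    segment.mark_deleted(seq)

    def search(self, word):
        self.flush()
        hits = []
        for segment in self.segments:
            hits.extend(segment.search(word))
        return hits


index = MultiSegmentIndex(buffer_limit=2)
index.add(1, "inverted index in DBMS")
index.add(2, "full-text index")
index.add(3, "B+-Tree index")
print(index.search("index"))  # [1, 2, 3]
index.delete(2)
print(index.search("index"))  # [1, 3]
```

In this sketch an update never rewrites an existing segment: additions create a new segment and deletions only set a bit. That is one plausible reading of why a multi-segment layout can make building and updating cheaper than a single-segment index. Segment merging, B+-tree storage, concurrency control, and recovery are left out.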