
Techniques to explore time-related correlation in large datasets

Posted on: 2003-12-27
Degree: Ph.D.
Type: Dissertation
University: Louisiana State University and Agricultural & Mechanical College
Candidate: Dua, Sumeet
GTID: 1468390011982822
Subject: Computer Science
Abstract/Summary:
The next generation of database management and computing systems will be significantly more complex, with data distributed in both functionality and operation. The complexity arises, at least in part, from the data types involved and the types of information requests issued by database users.

Time sequence databases are generated in many practical applications. Detecting similar sequences and subsequences within these databases is an important research area and has generated a lot of interest recently. Previous studies in this area have concentrated on calculating the similarity between (sub)sequences of equal size. Comparing (sub)sequences of unequal size to report similarity has been an open problem for some time. The problem is important and non-trivial.

In this dissertation, we propose a solution to the problem of finding sequences, in a database of unequal-sized sequences, that are similar to a given query sequence. We also propose a paradigm for finding pairs of similar subsequences, of equal or unequal size, within a pair of sequences. We put forward new approaches for sequence time-scale reduction, feature aggregation, and object recognition. To make the search for similar sequences efficient, we put forward an indexing technique for the unequal-sized sequence database. We also introduce a unique indexing technique for the subsequences identified within a reference sequence. This index is subsequently used to report pairs of similar subsequences when a query sequence is presented.

Our experimental results show that the relative amplitudes of the first few frequencies (which are then used for indexing) tend to behave similarly after time-scale reduction with the proposed technique. On a database of equal-sized time sequences, the implementation also reduced query processing time by about 27% compared with the previous approach in this area.
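
The idea of comparing sequences of unequal size through the relative amplitudes of their first few frequencies can be illustrated with a minimal sketch. This is not the dissertation's method: the abstract does not specify its time-scale reduction or indexing technique, so plain linear-interpolation resampling is used here as an assumed stand-in, and the function names (`resample`, `relative_amplitudes`, `feature_distance`) and the defaults `n=32`, `k=4` are hypothetical.

```python
import cmath
import math


def resample(seq, n):
    """Linearly interpolate seq onto n evenly spaced points.

    Assumption: a simple stand-in for time-scale reduction, not the
    technique proposed in the dissertation.
    """
    if n < 2 or len(seq) < 2:
        raise ValueError("need at least two points")
    out = []
    for i in range(n):
        pos = i * (len(seq) - 1) / (n - 1)   # fractional index into seq
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(seq) - 1)
        frac = pos - lo
        out.append(seq[lo] * (1 - frac) + seq[hi] * frac)
    return out


def relative_amplitudes(seq, k):
    """Amplitudes of DFT coefficients 1..k, each divided by their sum."""
    n = len(seq)
    amps = []
    for f in range(1, k + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * f * t / n)
                    for t, x in enumerate(seq))
        amps.append(abs(coeff) / n)
    total = sum(amps)
    return [a / total for a in amps] if total > 0 else amps


def feature_distance(a, b, n=32, k=4):
    """Euclidean distance between the k-dimensional feature vectors
    of two sequences that may differ in length."""
    fa = relative_amplitudes(resample(a, n), k)
    fb = relative_amplitudes(resample(b, n), k)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

In a sketch like this, each sequence collapses to a short fixed-length feature vector regardless of its original length, which is what makes it usable as an index key; a real system would store these vectors in a multidimensional index rather than compare them pairwise.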
Keywords/Search Tags:Time, Database, Sequence, Technique, Sized