
Distances in random tries via analytic probability: The oscillatory distribution

Posted on: 2005-09-11
Degree: Ph.D
Type: Dissertation
University: The George Washington University
Candidate: Christophi, Costas A
GTID: 1450390011950791
Subject: Statistics
Abstract/Summary:
Digital trees are data structures for storing data given in the form of strings, which are sequences of symbols from a finite alphabet. They have applications in many areas, such as computer science, telecommunications, chemistry, and computational biology, and they are also very common in algorithms on words. The trie is a popular kind of digital tree, initially proposed for information retrieval. We investigate $\Delta_n$, the distance between a randomly selected pair of nodes among $n$ keys in a random trie. In informatics, distances between nodes in a random combinatorial data object are of prime interest because they are indicative of the speed of communication within the structure. These distances have applications in many other fields as well.

Analytic techniques, such as the Mellin transform and the inverse Mellin transform, together with an excursion between poissonization and depoissonization, are used to capture small fluctuations in the mean and variance of these random distances. The mean increases logarithmically in the number of keys, but curiously enough the variance remains $O(1)$ as $n \to \infty$. It is demonstrated that the centered random variable $\Delta_n^* = \Delta_n - \lfloor 2 \log_2 n \rfloor$ does not have a limit distribution, but rather oscillates between two extremal values.

For several of the functional equations that we encountered for the poissonized version of the problem, a Mellin transform does not exist. One way to deal with this issue is to find suitable simple shifts that, when subtracted from both sides of the functional equation, ensure the existence of the Mellin transform. One can then work with these shifted equations to go through the Mellin and inverse Mellin transforms and the subsequent depoissonization process.

In the derivation of the mean and the variance we used the Mellin transform in a standard way and dealt with a fixed set of poles, which is the usual scenario in several similar problems in the analysis of algorithms. By contrast, our novel derivation of the asymptotic moment generating function deals with poles that move with the auxiliary variable of the moment generating function. We believe that this approach is portable to a broader class of problems and that it uses the Mellin transform technology to its fullest extent.

It appears that our work can be extended, without introducing essential difficulties, to data on alphabets larger than binary, which can be useful in other areas too, such as DNA studies.
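
The quantity studied in the abstract can be explored numerically. The following is a minimal simulation sketch, not the analytic method of the dissertation. It rests on assumptions that are not stated in the abstract: keys are strings of independent fair bits (truncated here to a hypothetical length `BITS`), each key sits in a leaf at the depth of its shortest unique prefix, and the distance is counted in edges between the leaves of two distinct keys chosen uniformly at random. The function names are illustrative only.

```python
import math
import random

BITS = 64  # assumed key length; long enough that ties among random keys are negligible


def lcp(a, b):
    """Common-prefix length of two distinct BITS-bit keys, most significant bit first."""
    return BITS - (a ^ b).bit_length()


def leaf_depths(keys):
    """Leaf depth of every key in the trie built from sorted, distinct keys.

    A key's leaf lies one level below its longest common prefix with any other
    key; in sorted order that maximum is attained by an adjacent neighbour.
    """
    depths = []
    for i, k in enumerate(keys):
        longest = 0
        if i > 0:
            longest = max(longest, lcp(k, keys[i - 1]))
        if i + 1 < len(keys):
            longest = max(longest, lcp(k, keys[i + 1]))
        depths.append(longest + 1)
    return depths


def sample_distance(n, rng):
    """Build one random trie on n keys and return the distance of one random pair."""
    keys = set()
    while len(keys) < n:  # redraw in the unlikely event of duplicate keys
        keys.add(rng.getrandbits(BITS))
    keys = sorted(keys)
    depths = leaf_depths(keys)
    i, j = rng.sample(range(n), 2)
    # Path length = depth(u) + depth(v) - 2 * depth(lowest common ancestor).
    return depths[i] + depths[j] - 2 * lcp(keys[i], keys[j])


def centered_stats(n, trials=2000, seed=0):
    """Empirical mean and variance of the distance minus floor(2 * log2(n))."""
    rng = random.Random(seed)
    shift = math.floor(2 * math.log2(n))
    xs = [sample_distance(n, rng) - shift for _ in range(trials)]
    mean = sum(xs) / trials
    var = sum((x - mean) ** 2 for x in xs) / trials
    return mean, var


if __name__ == "__main__":
    for n in (64, 91, 128, 181, 256):
        mean, var = centered_stats(n)
        print(f"n={n:4d}  centered mean={mean:+.3f}  variance={var:.3f}")
```

Running the loop over values of n between consecutive powers of two gives a rough empirical view of how the centered mean and the variance behave as n grows. The Monte Carlo noise at this number of trials is comparable in size to small fluctuations, so the output should be read as an illustration rather than as evidence for the asymptotic results.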
Keywords/Search Tags: Data, Random, Mellin transform, Distances