A Number Of Cutting-edge Issues In Algorithm Design | | Posted on:2010-08-19 | Degree:Doctor | Type:Dissertation | | Country:China | Candidate:H Sun | Full Text:PDF | | GTID:1118360302479038 | Subject:Computer software and theory | | Abstract/Summary: | PDF Full Text Request | | In this thesis we will study the algorithm design of various problems in the data streaming model, computational geometry, expanders and computational biology.For the application of database management system and Internet traffic some quantities are desired within fast running time and little space complexity. After the fundamental work of N. Alon et al. in 1996, data streaming algorithms have been studied extensively. In this thesis we shall consider the following three problems.On the range-efficient F0 estimating problem, we improve the previously best known result proposed by A. Pavan et al. in 2005 and give two improved algorithms. Using the streaming reduction technique we show that our algorithms can be used to give a better algorithm for the Maximum Dominating Norm problem.The second problem is to compare the similarity between different XML documents. Previous work used tree-edit distance to characterize this quantity. To give a more natural characterization we propose the Hamming distance to compare the correlation between different XML documents. Relying on N. Nisan's pseudorandom generators and stable distributions, we shall present the algorithm of computing this quantity in the data streaming model. Experimental result shows that with high probability this algorithm can give the desired approximation of the exact value.To gather voting information in Internet traffic Chapter 4 discusses the drawback of icebergs, heavy hitters and other quantities. Moreover we propose the notion of real icebergs. Based on error functions, Count-Min sketches as well as Loglog counting sketches, we give a probabilistic algorithm for evaluating the real icebergs in the poly-logarithmic space.The second part considers the complexity and the approximation algorithms for constructing the Minimum Manhattan Network (MMN). In the first paper of MMN, J. Gudmundsson et al. asked the following three open problems: (1) Whether or not this problem is NP-complete; (2) Whether or not a PTAS exists for this problem; (3) Design a 2-approximation algorithm for this problem.Chapter 5 shall give a positive answer of J. Gudmundsson et al.'s first problem, i.e. the decision version of the Minimum Manhattan Network problem is strongly NP-complete. The rest of this chapter is to present the reduction from 3-SAT problem as well as the correctness proof of our reduction.In Chapter 6 we present an O(n log n)-time 2-approximation algorithm for constructing the Minimum Manhattan Network. Compared with the previously best known 2-approximation algorithms that rely on dynamic programming or linear programming, we prove that the greedy strategy suffices to obtain a 2-approximation algorithm.The third part is to construct almost-Ramanujan graphs. With the help of Ramanujan Conjecture, G. A. Margulis et al. presented the first algorithm for constructing Ramanujan graphs in 1970s. Following this line, there has been a number of work on this topic. Among those papers, most algorithms rely on number theory and topology. On the other hand, computer scientists have tried to give a combinatorial construction of Ramanujan graphs. Chapter 9 begins with a more natural generalization of Zig-Zag product. As the simplification of A. Ben-Aroya et al.' work in 2008, we shall present a fully constructible algorithm for constructing almost-Ramanujan graphs.The last part of this thesis is to consider the DNA sequence coding in computational biology. Compared with the traditional coding theory, DNA sequence coding problems need to satisfy not only Hamming distance, but also other restrictions, e.g. free energy constraint, consecutive base constraint, and so on. In 2005, M.-Y. Kao formalized this problem and proposed some probabilistic algorithms for solving this problem. In Chapter 10 we give fast deterministic algorithms to design DNA words from two different approaches—the first one uses expander codes and explicitly constructible families of Ramanujan graphs, whereas the other relies on expectation-based techniques. We show that within the same running time as the randomized algorithms, the resulting codeword length is significantly smaller than the ones presented by M.-Y. Kao et al. | | Keywords/Search Tags: | data streaming model, minimum Manhattan networks, complexity, approximation algorithms, Ramanujan graphs, DNA words, coding theory | PDF Full Text Request | Related items |
| |
|