| This thesis explores algorithms for massive graphs. We explore efficient graph algorithms under standard models of processing such large data sets.;The main graph problem considered in this thesis is performing random walks efficiently. Random walks are fundamental across all of computer science ranging from theory, mathematics, distributed computing and web algorithms. Several applications include search, data mining, ranking (such as PageRank), measuring similarity between web pages, maintaining connectivity in P2P networks etc. The work in this thesis has developed the fastest algorithms for performing random walks in the streaming model, and in distributed networks, breaking past the linear-time barrier presented by the sequential nature of random walks. This work improves upon techniques that have been used for decades in both theory as well as practice.;In follow up work on the streaming model presented in this thesis, we consider the problem of graph partitioning. Several offline and few streaming/online algorithms have been suggested for partitioning an input graph into two parts, to approximate the conductance of the graph. While many of these techniques are impractical at the scale of the massive web graphs, some may be implementable. However, when dealing with several hundred million nodes and tens of billions of edges, visualizing one global cut becomes very difficult. What can one say from a partition that separates the graph into two humongous sets of nodes? In this work, we consider the problem of finding what we call cut projections. Given a (possibly small) subset of nodes from the graph, the objective is to partition the set of nodes such that this projects onto an induced cut of small conductance on the entire graph. The hope is that the bounds now strongly depend on the size of the specified set of nodes and only weakly on the size of the entire graph. We show how such cut projections can be obtained on regular graphs when the subset of nodes is chosen at random. While our theorems do not hold when these restrictions are omitted, the technique itself is likely to work well in practice.;In the distributed computing model, we again focus on the problem of performing random walks efficiently. Our results on this model work only on undirected networks (which is usually the case, since the communication paths are bidirectional).;In follow up work under the same congest model of distributed networks, we further improve the algorithm for performing several random walks. Further, we show a lower bound that suggests that a single random walk requires O( ℓ/logℓ ) rounds. Therefore, barring the diameter term, our approach for performing a single random walk is near-optimal. Our algorithm also introduces several new techniques and random walk properties that may be of independent interest. Further, we show how our algorithms for performing random walks efficiently can be used as subroutines for efficient distributed algorithms for two key properties: random spanning trees, and estimating mixing times. We show the fastest known distributed algorithms for both sampling a random spanning tree (which is useful in applications such as routing and generating sparsifiers), and estimating mixing times (which is crucial to understanding the connectivity of the network, a primary concern in peer to peer systems).;Finally, in the online framework of sketch based algorithms, we study the problem of estimating graph distances efficiently. Our objective is to be able to answer distance queries between a pair of nodes in real time. Since the standard shortest path algorithms are expensive, our approach moves the time-consuming shortest-path computation offline, and at query time only looks up precomputed values and performs simple and fast computations on these precomputed values. More specifically, during the offline phase we compute and store a small sketch for each node in the graph, and at query-time we look up the sketches of the source and destination nodes and perform a simple computation using these two sketches to estimate the distance. (Abstract shortened by UMI.)... |