Enabling efficient fault tolerance in distributed file systems through erasure codes

Posted on: 2012-12-28
Degree: M.S.
Type: Thesis
University: Oklahoma State University
Candidate: Yu, Li
Full Text: PDF
GTID: 2468390011968260
Subject: Computer Science
Abstract/Summary:
Over the past few years, distributed file systems have become core infrastructure for large-scale Internet applications. Because they are storage systems, many existing distributed file systems emphasize data availability and fault tolerance. A practical and popular way to improve data availability is to create extra copies of the data. The common drawback of this approach is inefficient use of storage space.

This thesis addresses the problem of efficient fault tolerance in distributed file systems. The goal is to reduce the space cost while maintaining similar or higher data reliability for the whole system. To that end, the thesis compares several erasure codes as alternatives, including traditional Maximum Distance Separable (MDS) erasure codes and Low-Density Parity-Check (LDPC) erasure codes. Algorithms for constructing applicable erasure codes are presented and illustrated.

A simulator is developed to simulate the data availability model under various parameter settings, and erasure code construction is implemented for the experiments. The solutions are evaluated on the following metrics: encoding and decoding efficiency, storage space overhead and utilization, and data availability. The evaluations are carried out through experiments in a practical environment as well as through simulation.

The experimental results demonstrate the validity and effectiveness of the proposed scheme. They show that an efficient fault-tolerant scheme for distributed file systems can be achieved by applying erasure codes. Compared to earlier MDS erasure codes such as Reed-Solomon codes, the family of LDPC erasure codes meets the goal of enabling efficient fault tolerance in distributed file systems with an acceptable trade-off between the extra encoding/decoding time and the storage overhead.
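
The space/availability trade-off described in the abstract can be illustrated with a small sketch. It is not the thesis's simulator or availability model, which the abstract does not specify. It assumes a textbook model in which each storage node is independently available with probability p, data is either replicated in full or split into k blocks and encoded into n blocks with an ideal MDS code (any k of the n blocks suffice to decode), and all function names are hypothetical. LDPC codes are not MDS and generally need somewhat more than k blocks to decode, so they are not modeled here.

```python
from math import comb


def replication_availability(p: float, copies: int) -> float:
    """Probability that at least one of `copies` full replicas is reachable.

    Assumption: nodes fail independently, each available with probability p.
    """
    return 1.0 - (1.0 - p) ** copies


def mds_availability(p: float, n: int, k: int) -> float:
    """Probability that at least k of n encoded blocks are reachable.

    Assumption: an ideal (n, k) MDS code, where any k blocks rebuild the data.
    """
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def storage_overhead_replication(copies: int) -> float:
    """Stored bytes per byte of user data when keeping full copies."""
    return float(copies)


def storage_overhead_mds(n: int, k: int) -> float:
    """Stored bytes per byte of user data for an (n, k) code."""
    return n / k


if __name__ == "__main__":
    p = 0.9  # hypothetical per-node availability
    print("3 replicas :", storage_overhead_replication(3), replication_availability(p, 3))
    print("(6,4) code :", storage_overhead_mds(6, 4), mds_availability(p, 6, 4))
    print("(12,4) code:", storage_overhead_mds(12, 4), mds_availability(p, 12, 4))
```

Under these assumptions, a (12, 4) code uses the same 3x storage as three replicas but tolerates the loss of any eight blocks, while a (6, 4) code halves the storage at the cost of lower availability. Encoding and decoding time, the other side of the trade-off named in the abstract, is not captured by this model.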
Keywords/Search Tags: Distributed file systems, Erasure codes, Efficient fault tolerance, Data availability, Storage