Design And Implementation Of A Distributed File Storage Service Platform Based On Hadoop

Posted on: 2013-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: J N Chuo
Full Text: PDF
GTID: 2248330395476609
Subject: Aerospace and information technology
Abstract/Summary:
With the rapid development of Internet applications, the volume of information and data on the Internet is growing explosively. How to organize and store this massive data has become an urgent issue. A large number of free and cheap storage resources currently exist in the network, both on the Internet and on intranets. Making use of these resources is an effective means of providing a large-scale storage infrastructure.

A distributed file system is one way to make use of distributed storage resources. However, traditional distributed file systems, such as HDFS of the Hadoop project, run on cluster systems with stable and similar nodes. Deploying a traditional distributed file system directly on a dynamic network of free nodes may result in issues such as low storage space utilization, poor adaptability to the network, and low credibility of storage nodes.

Based on the Hadoop open-source project, this thesis studies a novel distributed file storage model that adapts to the dynamic network environment, and designs and implements QDFS, a distributed file storage service platform that employs a data redundancy policy based on recovery volumes and a QoS-aware data placement strategy. The main contributions of the thesis are:
(1) The distributed file storage system is based on the dynamic network environment. It makes use of free and cheap storage resources in the network and hence reduces the total cost of ownership.
(2) The backup mechanism is based on recovery volumes; it greatly reduces the storage space needed for the redundant information of a file.
(3) The system establishes a tree-like system model based on hierarchical NameNodes to solve the problem that different clusters could not be shared within the same distributed system.
(4) A file transfer tool for clients is developed to solve the problem that the Hadoop client-side software could not run in the Windows environment.
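
The abstract does not specify how the recovery volumes in QDFS are computed, so the following is only a minimal sketch of the general idea behind contribution (2), under stated assumptions. It uses a single XOR parity block as a stand-in for a recovery volume (real recovery-volume schemes typically use stronger erasure codes), and all function names and parameters (k data blocks, m recovery blocks) are hypothetical. For comparison it also computes the extra space used by plain replication; 3 replicas is the HDFS default.

```python
from functools import reduce


def split_into_blocks(data, k):
    """Split data into k equal-length blocks, zero-padding the tail."""
    block_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(block_len * k, b"\0")
    return [padded[i * block_len:(i + 1) * block_len] for i in range(k)]


def xor_blocks(blocks):
    """Byte-wise XOR of a list of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_block(surviving_blocks, recovery_block):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(surviving_blocks + [recovery_block])


def replication_overhead(replicas):
    """Extra storage, as a multiple of the file size, for full replication."""
    return replicas - 1


def recovery_volume_overhead(k, m):
    """Extra storage, as a multiple of the file size, for m recovery blocks per k data blocks."""
    return m / k


# Assumed parameters for illustration only: 4 data blocks, 1 recovery block.
data = b"hello distributed storage"
blocks = split_into_blocks(data, 4)
parity = xor_blocks(blocks)

# Simulate losing the node that held block 2, then rebuild it.
survivors = blocks[:2] + blocks[3:]
rebuilt = recover_block(survivors, parity)
```

In this toy configuration the recovery block adds a quarter of the file size, whereas three-way replication adds twice the file size. A single XOR block tolerates only one lost block per group, so the comparison illustrates the space trade-off rather than equivalent fault tolerance.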
Keywords/Search Tags: Distributed file storage, Hadoop, Redundancy, Quality aware, Cloud storage