Various industries have developed distributed file systems suited to their own fields, such as HDFS, GlusterFS, Haystack, and Ceph. After investigation and analysis, we found that these file systems are designed primarily for large-file storage; once large-scale small-file I/O is involved, their performance degrades sharply or they fail to work at all. Therefore, this paper starts from the storage format and the fault-tolerance mechanism, and optimizes the I/O performance of small files in a distributed storage system.

Existing file systems do not support small files well. To address this shortcoming, this paper describes a storage format and mechanism for small files on the data storage server: small files are packed into large data block files, each data block contains a large number of small files, and an index file identifies each file within its data block. Experimental results show that the optimized storage format achieves good I/O performance.

In a traditional file system, each file corresponds to one metadata record. Once large-scale small-file access is involved, this approach becomes a severe limitation. The optimized small-file storage format greatly reduces the amount of metadata, which makes caching the metadata feasible. This paper therefore introduces a cache mechanism into the distributed file system that reduces file access latency by keeping the index files in the cache.

This paper also introduces the use of erasure codes in small-file storage. Under the storage mechanism introduced in Chapter 2, the extended block occupies only a small amount of space, and when data is restored it is no longer necessary to read other files to decode and rebuild the damaged files. This greatly reduces both network overhead and computational overhead.
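To make the packing scheme concrete, the following is a minimal sketch, not the implementation described in this paper: small files are appended to one large data block file, and a fixed-size entry (name, offset, length) is written to a companion index file so that each small file can be located without its own per-file metadata record. All class, function, and field names here (BlockWriter, put, read_small_file, the index entry layout) are illustrative assumptions.

```python
import os
import struct

# Hypothetical index entry: 32-byte file name, 8-byte offset, 4-byte length.
INDEX_ENTRY = struct.Struct("<32sQI")

class BlockWriter:
    """Appends small files to one large data block and records each in an index file."""

    def __init__(self, block_path, index_path):
        self.block = open(block_path, "ab")
        self.index = open(index_path, "ab")

    def put(self, name, payload):
        # The small file starts at the current end of the data block.
        self.block.seek(0, os.SEEK_END)
        offset = self.block.tell()
        self.block.write(payload)
        # One fixed-size index entry per small file: (name, offset, length).
        self.index.write(INDEX_ENTRY.pack(name.encode()[:32], offset, len(payload)))
        return offset

    def close(self):
        self.block.close()
        self.index.close()

def read_small_file(block_path, offset, length):
    """Reads one small file back from the data block given its index entry."""
    with open(block_path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Because many small files share a single block and a single index file, the metadata the system must track shrinks from one record per file to roughly one record per block, which is what makes caching the index files practical.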