Research And Implementation Of A File System For Massive Small Files Performance Optimization

Posted on: 2020-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: W K Zheng
Full Text: PDF
GTID: 2428330590458322
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of technology in the big data era, data on the Internet is growing explosively. Research shows that most data on the network is stored as small files, but metadata access for small files is not well optimized in current local file systems. A key-value system based on a Log-structured Merge (LSM) tree can store metadata together, which improves write performance significantly. Although a key-value system speeds up write operations, it also causes read amplification for random read and range scan operations, resulting in performance loss.

In this paper, we propose KVFS (Key Value File System), a new type of file system based on a key-value storage system, to reduce read amplification and improve performance. KVFS uses a key-value system to store metadata together with the file data of small files, while the data of big files remains in the local file system. Because KVFS accesses metadata by directory, we designed a new key-value storage system called FlatDB, which is organized in a flat structure, to store the key-value pairs of the file system's metadata. FlatDB uses a directory hashing strategy to organize key-value data, and maintains the data of recently accessed directories in a flat KV layer called DTable (Dir Table). DTable adopts a directory-based local ordering strategy: the metadata within the same directory is strictly sorted, while metadata in different directories is located by hashing, which reduces the system's sorting overhead while keeping queries fast. FlatDB takes full advantage of NVM (Non-volatile Memory) to improve the access speed of DTable. FlatDB uses append mode because it turns random writes into sequential writes and provides better performance. Data in DTable is indexed by hashing the key and the offset of the data for fast access. FlatDB keeps the file metadata of the same directory together and manages a single layer of SSTable files, unlike the LSM-Tree's multi-level structure. SSTable files in FlatDB are properly sorted and do not overlap with each other in data range, avoiding comparison operations between levels and useless old data. FlatDB improves the performance of read and range-read operations and reduces write amplification.

KVFS is implemented on top of the open source code of TableFS and RocksDB. Experiments show that KVFS outperforms TableFS by 60.2% in the same environment, and in particular brings a 2x speed-up in range scan operations. In addition, the disk space usage of KVFS is 14% to 50% relative to that of TableFS.
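
The directory-based local ordering strategy described for DTable can be illustrated with a small in-memory sketch. This is not FlatDB's code: the class name `DirTable` and the methods `put`, `get`, and `scan` are hypothetical, and NVM, append-mode logging, and SSTable files are left out. It only shows the core idea that a directory is located by hashing, while entries are kept sorted within each directory, so no ordering across directories is ever maintained.

```python
import bisect


class DirTable:
    """Toy model of directory-based local ordering (hypothetical, not FlatDB's API)."""

    def __init__(self):
        # Hash map: directory path -> (sorted list of names, name -> metadata).
        self._dirs = {}

    def put(self, directory, name, metadata):
        """Insert or update one entry; only this directory's names are re-sorted."""
        names, entries = self._dirs.setdefault(directory, ([], {}))
        if name not in entries:
            bisect.insort(names, name)  # sorted insert local to this directory
        entries[name] = metadata

    def get(self, directory, name):
        """Point lookup: hash to the directory, then hash to the entry."""
        bucket = self._dirs.get(directory)
        if bucket is None:
            return None
        return bucket[1].get(name)

    def scan(self, directory, start="", end=None):
        """Yield (name, metadata) in sorted order for start <= name < end."""
        bucket = self._dirs.get(directory)
        if bucket is None:
            return
        names, entries = bucket
        lo = bisect.bisect_left(names, start)
        hi = len(names) if end is None else bisect.bisect_left(names, end)
        for name in names[lo:hi]:
            yield name, entries[name]
```

In this sketch a directory listing is a single hash lookup followed by a walk over an already sorted list, which is the access pattern the abstract associates with faster range scans. A globally sorted store would instead have to keep keys from all directories in one order.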
Keywords/Search Tags: File system, Small files, Key-value storage, LSM-Tree, NVM