
Research And Implementation Of A Small Object Access Performance Optimization On Swift

Posted on: 2017-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: X G Wang
Full Text: PDF
GTID: 2308330503987182
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development and popularization of the internet, a growing number of individual users are using web applications, which generate large amounts of data. Cloud technology is also leading more enterprise users to migrate their data services to the cloud. Whether a service provider serves individuals or enterprises, the data stored in its data centers is growing explosively. Web applications such as e-commerce, social networking, and video produce a large number of small files every day. Traditional storage systems are often designed for large files and perform poorly on small files. In this thesis, we exploit the characteristics of small-file storage to optimize the read and write paths of a storage system, using the OpenStack Swift distributed storage system as the experimental platform.

Firstly, the Swift object service calls the underlying file system to read and write data. When small files are accessed randomly, storage nodes must frequently read metadata from disk, which wastes a large amount of disk I/O. By merging small files and building an in-memory index of the files, we reduce the space consumed by metadata. Storage nodes can then cache all file metadata in memory, so that accessing a small file requires only one disk I/O operation, which improves the random read and write performance of the Swift storage system for small files. By changing how files are organized and consolidating files by virtual partition, we also reduce the impact of data migration on bandwidth.

Secondly, the CDN services in front of a distributed storage system usually provide an external cache. Data with temporal locality is mostly served from that external cache, which lowers the cache hit rate of the storage system itself. We analyze file access logs and use data mining to extract file access patterns, then merge strongly correlated files and store them together. Moreover, when a file is accessed we prefetch its associated files to improve the data access performance of the Swift storage system. The impact of incorrect prefetching on system performance is reduced by verifying file associations during the caching process.

Finally, we test the new system by simulating reads and writes of files of different sizes with COSBench to verify the performance improvement, and we analyze the impact of prefetching on system performance using real access logs. The experiments show that, with merged storage, the improvement in system bandwidth is more pronounced the smaller the files are. Merged storage also reduces the impact of data migration on system throughput. Although the association-based prefetching strategy can reduce system bandwidth, it improves the average response time of the system.
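
The merged-storage idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation and not part of Swift's API: the class `PackedVolume`, its `put` and `get` methods, and the single append-only volume file are all hypothetical names and simplifications chosen for illustration. The sketch packs many small objects into one large file and keeps a name-to-(offset, length) index in memory, so a read needs no per-object metadata lookup on disk, only one positioned read of the volume file.

```python
import os
import tempfile
from typing import Dict, Tuple


class PackedVolume:
    """Hypothetical append-only volume that packs many small objects into one file."""

    def __init__(self, path: str) -> None:
        self.path = path
        # In-memory index: object name -> (offset, length) inside the volume file.
        self.index: Dict[str, Tuple[int, int]] = {}
        # Create the volume file if it does not exist yet.
        open(path, "ab").close()

    def put(self, name: str, data: bytes) -> None:
        """Append the object to the volume and record its location in memory."""
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # current end of the volume
            f.write(data)
        # Overwriting a name only repoints the index; the old bytes stay in
        # the file as garbage (this sketch has no compaction).
        self.index[name] = (offset, len(data))

    def get(self, name: str) -> bytes:
        """Look up the location in memory, then do one positioned read."""
        offset, length = self.index[name]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        vol = PackedVolume(os.path.join(tmp, "volume.dat"))
        vol.put("a.jpg", b"first")
        vol.put("b.jpg", b"second")
        print(vol.get("b.jpg"))  # prints b'second'
```

A real system would also need to persist or rebuild the index after a restart, reclaim space from deleted or overwritten objects, and handle concurrent writers; the sketch omits all of these.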
Keywords/Search Tags: Distributed Storage, Swift, Small File, Prefetching