With the increasing scale of data storage, massive file sets need to be migrated between devices, storage nodes, and even data centers. However, traditional file migration schemes, whether between devices or across networks, are essentially based on per-file read and write system calls. As a result, the underlying file system processes a batch of files one by one, and each file accesses its metadata and data separately, producing a large number of random I/Os and degrading overall access efficiency. To solve these problems, a mechanism called BFM (Batch File Aggregated Migration) is proposed. Its core idea is to aggregate all metadata and all data from a file set and process each group separately. Based on this mechanism, the overall layout of the file set's data on the storage device can be obtained, and the read and write order can be optimized so that many small I/Os are merged into large, sequential ones as far as possible. To implement this mechanism, BFM provides batch read and write interfaces, BFM-r and BFM-w, in the kernel. BFM also designs a series of optimization techniques, including layout-aware read scheduling, unified address management based on an order_list, and a two-phase reliability guarantee mechanism, which together eliminate a large number of discrete I/Os while ensuring reliability. Furthermore, BFM extends data migration to network nodes: by aggregating and transferring all file data and metadata, and acknowledging transmissions in batches, network overhead is significantly reduced. The mechanism has been implemented on the Ext4 file system. Experiments show that data migration speed on disk and across networks can be increased by 98% and 432% respectively, and the number of read and write I/Os can be reduced to 21% and 64% of the original, respectively.
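The layout-aware scheduling idea described above can be illustrated with a minimal user-space sketch. This is not BFM's actual in-kernel implementation; the function name, extent data, and the zero-gap merging policy are all illustrative assumptions. Given each file's on-disk extent as an (offset, length) pair, sorting by offset and coalescing contiguous extents turns many small reads into a few large sequential ones:

```python
def coalesce_reads(extents, gap=0):
    """Illustrative sketch of layout-aware read scheduling (not BFM's kernel code).

    extents: list of (offset, length) pairs, one per file's on-disk data.
    gap: maximum hole (in bytes) still worth absorbing into one sequential read.
    Returns a minimal list of merged (offset, length) runs, sorted by offset.
    """
    merged = []
    for off, length in sorted(extents):
        if merged and off <= merged[-1][0] + merged[-1][1] + gap:
            # Extent starts within (or adjacent to) the current run: extend it.
            last_off, last_len = merged[-1]
            new_end = max(last_off + last_len, off + length)
            merged[-1] = (last_off, new_end - last_off)
        else:
            # Extent is disjoint: start a new sequential run.
            merged.append((off, length))
    return merged

# Example: five small files; four are physically contiguous on disk,
# one is isolated. Five small reads collapse into two sequential ones.
extents = [(4096, 512), (0, 4096), (8192, 1024), (4608, 3584), (100000, 512)]
print(coalesce_reads(extents))  # → [(0, 9216), (100000, 512)]
```

In the real mechanism this ordering would be computed from the file system's extent metadata before issuing the batch read, which is what allows BFM-r to replace per-file random accesses with a few large sequential I/Os.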