| With the advent of the big data era, application data volumes have grown dramatically, placing higher demands on the storage system. Copy-on-write (CoW), write-ahead logging (WAL), data duplication, and data copy/migration cause large amounts of duplicate writes in multiple storage layers, including applications and the kernel. Different layers may also be coupled with each other, which significantly degrades system performance. Moreover, because non-volatile memory (NVM) has limited write endurance, duplicate writes also shorten NVM lifetime. Eliminating duplicate writes to improve the performance and lifetime of the storage system has therefore become a research hotspot. Address remapping is widely used in storage systems; examples include the virtual-to-physical memory address mappings introduced by virtual memory and the logical-to-physical solid-state drive (SSD) address mappings introduced by the out-of-place updates of flash memory. Address remapping provides an effective way to eliminate duplicate writes in the storage system. Research on address remapping-based, cross-layer synergistic write optimization methods is thus of great importance for improving the performance and lifetime of the storage system. CoW is widely used in applications and OS kernels to protect data consistency or reduce memory consumption. However, CoW brings a large number of memory writes, long-tail latency, and the compound CoW problem between applications and the kernel. A memory address remapping-based CoW optimization method, called ReCoW, is proposed to perform cross-layer synergistic optimization on applications and the kernel. ReCoW uses memory address remapping to complete the "virtual" data copy of applications. Setting the associated virtual memory as write-protected offloads the actual data copy to the kernel. When the application writes data to the write-protected memory, the kernel CoW
is triggered to complete the actual data copy on demand. Taking hash table resizing (a generalized form of CoW) as an example, a prototype of ReCoW is implemented and evaluated. The experimental results show that, compared with conventional hash table resizing, ReCoW reduces memory writes by up to 32% and tail latency by up to 96.9%. Compared with the compound CoW, ReCoW reduces memory writes by up to 50% and tail latency by up to 99.4%. WAL is widely used in many databases to guarantee data consistency, but it introduces duplicate writes during checkpointing. An SSD address remapping-based WAL optimization method, called SW-WAL, is proposed to perform cross-layer synergistic optimization on applications, the kernel, and the SSD. SW-WAL uses SSD address remapping to complete WAL checkpointing writes, eliminating duplicate writes. Meanwhile, when database applications write transactional data into the WAL file, SW-WAL acquires the corresponding mapping and transactional semantics from applications and the kernel. These semantics are delivered to the SSD and stored in flash to guarantee transactional atomicity and SSD address mapping consistency. Taking the SQLite database as an example, a prototype of SW-WAL is implemented and evaluated. Experimental results show that, compared with traditional SQLite, SW-WAL improves performance by up to 62% and reduces flash writes and erases by up to 45% and 46%, respectively. Compared with the atomic-write SSD X-FTL, SW-WAL improves performance by up to 32% and reduces flash writes and erases by up to 23% and 25%, respectively. To eliminate duplicate writes in the storage system, an SSD address remapping-based write optimization method, called Remap-SSD-LH, is proposed to perform cross-layer synergistic optimization on applications, the kernel, and the SSD. It uses an SSD address remapping primitive to eliminate duplicate writes of applications and the kernel. Meanwhile, it maintains a local log based on NVM-flash hybrid storage for each garbage collection
unit to record the corresponding remapping metadata, which guarantees the mapping consistency of the SSD and achieves efficient mapping management and lookup. Remap-SSD-LH is implemented and evaluated in three scenarios. Comprehensive experimental results show that, compared with a traditional SSD, Remap-SSD-LH improves performance by up to 5.4x and reduces flash writes by up to 72.7%. Compared with address remapping SSDs that maintain a global remapping log, Remap-SSD-LH improves performance by up to 53%. |
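
The idea of completing a WAL checkpoint by SSD address remapping, rather than by rewriting the data, can be illustrated with a minimal sketch. It is a toy in-memory model and not the SW-WAL or Remap-SSD-LH implementation: the class `ToyFTL`, its `remap` method, the reference-count bookkeeping, and both checkpoint helpers are hypothetical names introduced only for illustration, and the sketch ignores the persistence of remapping metadata, atomicity, and garbage collection that the methods above address.

```python
class ToyFTL:
    """Toy flash translation layer (hypothetical): logical -> physical page map."""

    def __init__(self):
        self.l2p = {}          # logical page number -> physical page number
        self.pages = {}        # physical page number -> stored data
        self.refs = {}         # physical page number -> number of logical pages mapped to it
        self.next_ppn = 0      # next free physical page (append-only, no GC in this toy)
        self.flash_writes = 0  # count of physical page programs

    def _unmap(self, lpn):
        """Drop the mapping of lpn; free the physical page if nothing references it."""
        ppn = self.l2p.pop(lpn, None)
        if ppn is not None:
            self.refs[ppn] -= 1
            if self.refs[ppn] == 0:
                del self.refs[ppn]
                del self.pages[ppn]

    def write(self, lpn, data):
        """Out-of-place update: every write programs a fresh physical page."""
        self._unmap(lpn)
        ppn = self.next_ppn
        self.next_ppn += 1
        self.pages[ppn] = data
        self.refs[ppn] = 1
        self.l2p[lpn] = ppn
        self.flash_writes += 1

    def remap(self, src_lpn, dst_lpn):
        """Make dst_lpn share src_lpn's physical page instead of copying the data."""
        ppn = self.l2p[src_lpn]
        if self.l2p.get(dst_lpn) == ppn:
            return  # already shares the page
        self._unmap(dst_lpn)
        self.l2p[dst_lpn] = ppn
        self.refs[ppn] += 1

    def read(self, lpn):
        return self.pages[self.l2p[lpn]]


def checkpoint_by_copy(ftl, wal_to_db):
    """Conventional checkpoint: read each WAL page and write it again to the database file."""
    for wal_lpn, db_lpn in wal_to_db.items():
        ftl.write(db_lpn, ftl.read(wal_lpn))


def checkpoint_by_remap(ftl, wal_to_db):
    """Remapping checkpoint: point each database page at the already-written WAL page."""
    for wal_lpn, db_lpn in wal_to_db.items():
        ftl.remap(wal_lpn, db_lpn)


def run(checkpoint):
    ftl = ToyFTL()
    wal_to_db = {100 + i: i for i in range(4)}  # WAL pages 100..103 -> database pages 0..3
    for wal_lpn in wal_to_db:
        ftl.write(wal_lpn, f"txn-page-{wal_lpn}")  # transaction data goes to the WAL first
    checkpoint(ftl, wal_to_db)
    return ftl


copied = run(checkpoint_by_copy)
remapped = run(checkpoint_by_remap)
print("flash writes with copy checkpoint: ", copied.flash_writes)
print("flash writes with remap checkpoint:", remapped.flash_writes)
```

In this toy model the remapping checkpoint programs each page once, whereas the copying checkpoint programs it twice; the reference counts keep a shared physical page alive when the WAL page is later overwritten. The percentages reported in the abstract come from the authors' prototypes, not from a model like this one.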