| Parity-based RAID storage arrays are widely adopted in modern data centers, cloud platforms, and supercomputing centers. Conventional HDD-based RAID arrays cannot meet the performance requirements of modern applications. Flash-based solid-state drives (SSDs) have been widely adopted because of their low latency, high throughput, and low power consumption. However, simply deploying parity-based RAID on SSDs leads to a serious performance variance problem. SSDs exhibit long, perceivable latency spikes due to internal background maintenance activities such as write buffer flushing and garbage collection. The lengthy write path in RAID aggravates the performance variance of SSDs. In addition, bursts of I/O requests issued by applications within a short time can greatly increase queuing time, which also contributes to latency variance. Existing research on SSD arrays lacks a comprehensive analysis of the causes of performance fluctuation and targeted improvements that address them. This thesis explores how to construct SSD arrays that provide consistently low latency. The average and tail latencies of SSD arrays are significantly reduced by breaking the resource isolation between multiple RAID groups on storage servers, introducing a replication-based write buffer, and adding a request redirection mechanism that sidesteps degraded SSDs. The main contributions of the thesis are as follows: (1) Extensive preliminary experiments are conducted, and the results provide insights that guide the solution to the latency problem in SSD arrays. First, a statistical analysis of a large number of real workloads reveals that request bursts are common, but that bursts from multiple workloads usually interleave because bursts are sparsely and irregularly distributed. Second, the causes and characteristics of latency spikes induced by background activities in SSDs are explored in depth and summarized. Last, the prolonged write procedure introduces additional software overhead, exacerbating the latency degradation and variance caused by the underlying SSDs.
(2) A new SSD RAID architecture called Fusion RAID is proposed, which significantly reduces both mean and tail latency. Fusion RAID takes advantage of large commodity enclosures and shares the storage resources of multiple co-located RAID volumes to build a storage pool with elastic data placement. Fusion RAID uses replication as a prelude to RAID, removing the complicated parity update process from the critical path of writes. Data is first written as replicas and then lazily converted to RAID in the background to achieve higher space utilization. Moreover, Fusion RAID adopts several strategies to alleviate the write amplification introduced by this replication-based acceleration. Fusion RAID detects a degraded SSD with a lightweight scheme based on the state of dispatched requests and redirects I/O requests destined for a degraded SSD to other, more responsive SSDs. (3) A Fusion RAID prototype is implemented as a Linux module atop the generic block layer and evaluated with comprehensive experiments. Experimental results show that, compared with conventional RAID, Fusion RAID reduces mean latency and p99 latency by 59.15% and 88.08% on average, respectively. |
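
The abstract does not specify how a degraded SSD is detected beyond saying that the scheme relies on the state of dispatched requests. As a rough illustration only, the sketch below assumes one plausible reading: a device is treated as degraded when its oldest outstanding request has been waiting longer than a threshold, and new requests are then steered to a more responsive device. The names `Device`, `pick_target`, and `stall_threshold`, as well as the 2 ms default, are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical per-SSD bookkeeping of dispatched but uncompleted requests."""
    name: str
    inflight: dict = field(default_factory=dict)  # request id -> dispatch time (s)

    def dispatch(self, req_id, now):
        self.inflight[req_id] = now

    def complete(self, req_id):
        self.inflight.pop(req_id, None)

    def is_degraded(self, now, stall_threshold):
        # Assumed heuristic: degraded if the oldest outstanding request
        # has waited longer than the threshold.
        if not self.inflight:
            return False
        return now - min(self.inflight.values()) > stall_threshold


def pick_target(preferred, alternates, now, stall_threshold=0.002):
    """Return the preferred device unless it looks degraded; otherwise
    return the first alternate that does not look degraded."""
    if not preferred.is_degraded(now, stall_threshold):
        return preferred
    for dev in alternates:
        if not dev.is_degraded(now, stall_threshold):
            return dev
    return preferred  # every candidate is stalled; keep the original target
```

In this sketch the check needs only timestamps that the dispatcher already holds, which is one way such a scheme could remain lightweight; the actual Fusion RAID mechanism may differ.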