
Research On Efficient Data Storage And Intelligent Computing In Cloud Environment

Posted on: 2022-08-23    Degree: Doctor    Type: Dissertation
Country: China    Candidate: L J Yin    Full Text: PDF
GTID: 1528307169476324    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the big data industry, intelligent computing, represented by deep learning, has become an emerging method for big data analysis and processing. In recent years, data storage and intelligent computing have increasingly relied on cloud computing technology for basic support. In response to the trustworthiness and dynamism challenges of the public cloud environment, this thesis studies efficient, reliable, and elastically scalable data storage and intelligent computing technologies, including: highly available collaborative storage based on an adaptive metadata bit tree, highly secure log-structured storage based on verifiable index hash trees, a flexible scheduling framework for GPU-CPU collaborative deep learning computing, and memory-access acceleration for deep learning on multi-core CPUs.

To address the reliability problem of data storage in the cloud, we propose MapperX, an extension of DM-cache. The SSD-HDD hybrid architecture based on the Linux kernel's DM-cache is widely used for data storage in cloud environments: the HDD serves as the main storage device that persists all data, while the SSD serves as the HDD's cache to improve overall data I/O performance. However, DM-cache's asynchronous metadata maintenance mechanism prevents the dirty-bit information of SSD cache blocks from being updated in a timely manner; as a result, recovery after a failure takes too long and the availability of the DM-cache system is low. To solve this problem, MapperX introduces an adaptive metadata bit tree (ABT) that synchronously maintains dirty-bit metadata in a hierarchical tree structure. MapperX describes the distribution of dirty bits by adaptively adding or removing leaves at different levels of the ABT, and it controls leaf addition and deletion according to a service-level agreement (SLA) on persistence latency, realizing adaptive metadata update granularity
adjustment. Experimental results show that the collaborative storage mechanism based on MapperX far outperforms the existing DM-cache mechanism in failure recovery time while introducing only negligible metadata persistence overhead.

To address the security problem of data storage in the cloud, we propose SwornDisk, an efficient encrypted data storage scheme. Building on the log-structured merge tree (LSM tree) and the Merkle hash tree (MHT), SwornDisk provides confidentiality, integrity, freshness, and anonymity protection for data I/O. For write operations, SwornDisk persists data to the physical disk by appending it to a log. Different historical versions of the data at the same logical address are recorded at different physical locations (i.e., updated out of place), so an attacker cannot mount an attack by rolling the data at a given physical location back to an earlier version. SwornDisk stores the mapping from logical block address (LBA) to physical block address (PBA), together with each block's key and message authentication code (MAC), in the in-memory structure of the LSM tree, while the LSM tree's persistent storage structure (the SSTable) is protected with MHT-based encryption. Experimental results show that SwornDisk significantly improves data security while having almost no impact on I/O performance.

To address the dynamic elastic scheduling problem of GPU/CPU collaborative computing, we propose Elastic Scheduler (ES), an efficient elastic scheduling framework for GPU-CPU collaborative deep learning computing. ES proposes a new local gradient accumulation algorithm that effectively solves the problems of CPU/GPU speed mismatch and long-term momentum compensation during dynamic computation. ES supports collaborative computing (different types of GPU and CPU devices can jointly perform deep learning computation) and dynamic computing (the numbers of GPUs and CPUs can change over time during computation). To solve the
problem of speed mismatch between the GPU and the CPU, ES uses the local gradient accumulation algorithm to accumulate gradients locally on the GPU and thereby simulate multiple virtual GPUs. The total throughput of the virtual GPUs equals that of the physical GPU, while each virtual GPU runs at 1/n of the physical GPU's speed (where n is the number of virtual GPUs); the speed of a virtual GPU thus matches that of a CPU, and the virtual GPUs and physical CPUs can be synchronized for parallel computation, solving the collaborative computing problem. In dynamic computing scenarios, a large adjustment to the number of devices triggers a long momentum compensation process and reduces model accuracy. ES uses the local gradient accumulation algorithm to keep the overall batch size stable when the number of devices changes sharply, thereby preserving the convergence accuracy of the model. Experimental results show that ES effectively supports flexible scheduling of deep learning computing tasks across GPUs and CPUs in the cloud.

To address the memory access contention problem of deep learning on multi-core CPUs, we propose ParaX, a memory-access acceleration method for multi-core CPU deep learning. Instead of the traditional single-instance-per-CPU approach, ParaX adopts a "one instance per core" scheme that assigns a deep learning instance to every CPU core for data parallelism, allowing each core to process its mini-batches independently and thereby avoiding the cross-core synchronization barrier at every layer of the DNN model. ParaX divides DNN layers into two categories: compute-intensive layers that perform complex arithmetic operations (such as convolution and matrix multiplication) and memory-intensive layers (such as batch normalization and activation layers). The one-instance-per-core scheme enables mixed execution of memory-intensive and compute-intensive network layers
and bandwidth sharing between different layers, which greatly improves the CPU's memory bandwidth utilization. ParaX adopts a synchronous SGD strategy, updating the model parameters at the end of each iteration during training. To meet ParaX's particular requirement of parameter synchronization across CPU cores, we design a gradient-server communication mechanism that supports the NUMA (non-uniform memory access) architecture and uses shared memory to effectively reduce the communication overhead of parameter synchronization. Experimental results show that ParaX significantly improves the performance of deep learning model training and inference on multi-core CPUs.
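The ABT idea above can be illustrated with a minimal sketch. The class and method names are ours, not MapperX's actual code, and a binary split over the block-index space is assumed: a coarse leaf covers many blocks with a single dirty bit (cheap to persist, expensive to recover), while splitting a leaf tracks dirtiness more precisely.

```python
class ABTNode:
    """One node of an adaptive metadata bit tree (illustrative sketch).

    Covers the cache-block range [lo, hi). A leaf carries one dirty
    bit for its whole range; adding children refines the granularity.
    """
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.dirty = False
        self.children = None  # None => this node is currently a leaf

    def split(self):
        """Refine granularity: replace this leaf with two half-range leaves."""
        mid = (self.lo + self.hi) // 2
        self.children = [ABTNode(self.lo, mid), ABTNode(mid, self.hi)]
        for c in self.children:
            c.dirty = self.dirty  # conservatively inherit dirtiness

    def mark_dirty(self, block):
        """Record a dirty cache block at the current leaf granularity."""
        if self.children is None:
            self.dirty = True
            return
        for c in self.children:
            if c.lo <= block < c.hi:
                c.mark_dirty(block)

    def recovery_set(self):
        """Blocks that must be treated as dirty after a crash."""
        if self.children is None:
            return set(range(self.lo, self.hi)) if self.dirty else set()
        blocks = set()
        for c in self.children:
            blocks |= c.recovery_set()
        return blocks
```

With one coarse leaf over 8 blocks, dirtying block 3 forces all 8 blocks into the recovery set; after two splits, the same write implicates only blocks 2-3 — the trade-off the SLA-driven leaf addition/removal navigates.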
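SwornDisk's out-of-place updates and per-block MACs can be sketched in a toy form. This is not SwornDisk's implementation: a plain dict stands in for the LSM-tree index, a Python list stands in for the disk log, and encryption of the block payload is omitted, keeping only the append-only placement and the MAC check that defeats rollback to an old physical version.

```python
import hashlib
import hmac
import os

class LogDisk:
    """Toy out-of-place (log-structured) block store (illustrative)."""
    def __init__(self):
        self.log = []    # physical blocks, append-only
        self.index = {}  # LBA -> (PBA, per-write MAC key, MAC)

    def write(self, lba, data):
        pba = len(self.log)  # every version lands at a fresh PBA
        key = os.urandom(16)
        mac = hmac.new(key, data, hashlib.sha256).digest()
        self.log.append(data)
        self.index[lba] = (pba, key, mac)

    def read(self, lba):
        pba, key, mac = self.index[lba]
        data = self.log[pba]
        expected = hmac.new(key, data, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            raise ValueError("integrity/rollback check failed")
        return data
```

Because the index (the stand-in for the LSM tree) always points at the newest PBA and stores that version's MAC, substituting an older physical block fails verification rather than silently rolling the data back.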
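The core arithmetic of local gradient accumulation can be shown in a few lines. The function name is ours and gradients are reduced to scalars; the point is only that accumulating n local steps before each synchronization makes the sync rate 1/n of the step rate, which is how a fast physical GPU is throttled into n CPU-speed virtual GPUs.

```python
def local_grad_accumulate(grads, n):
    """Accumulate n consecutive local gradients before each sync
    (scalar sketch of ES's local gradient accumulation).

    Returns the synchronized gradients: one averaged value per
    group of n local steps, i.e. a sync rate of 1/n.
    """
    synced = []
    acc = 0.0
    for i, g in enumerate(grads, 1):
        acc += g
        if i % n == 0:          # sync point: every n-th local step
            synced.append(acc / n)  # average, as in synchronous SGD
            acc = 0.0
    return synced
```

Four local steps with n = 2 produce two synchronizations, each carrying the mean of its group, so the device participates in synchronous SGD at half its native speed without discarding any gradient information.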
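The one-instance-per-core scheme with a shared-memory gradient sum can likewise be sketched. Threads stand in here for ParaX's pinned per-core processes, and a summed scalar stands in for the gradient served by its NUMA-aware gradient server; all names are illustrative. Each instance runs its batches with no per-layer barrier and touches shared state only once, at the synchronous-SGD sync point.

```python
import threading

def parax_step(per_core_batches):
    """One training step of one-instance-per-core data parallelism
    (illustrative sketch, threads standing in for per-core processes)."""
    shared = {"grad": 0.0}        # stand-in for the shared-memory gradient buffer
    lock = threading.Lock()

    def instance(batches):
        local = sum(batches)      # stand-in for a local gradient over own batches
        with lock:                # the only cross-core interaction per step
            shared["grad"] += local

    threads = [threading.Thread(target=instance, args=(b,))
               for b in per_core_batches]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # synchronous SGD: average the per-instance gradients once per iteration
    return shared["grad"] / len(per_core_batches)
```

The design point is that synchronization cost is paid once per iteration at the gradient sum, not once per DNN layer, which is what lets memory-bound and compute-bound layers of different instances overlap.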
Keywords/Search Tags:Cloud computing, data storage, intelligent computing, DM-cache, highly-available heterogeneous collaborative storage, highly-secure log-structured storage, GPU-CPU collaborative elastic scheduling, many-core CPU memory access bandwidth bottleneck