Design And Implementation Of A Highly Available RDMA Solution For Data Centers

Posted on: 2024-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Chen
GTID: 2568306932962049
Subject: Computer Science and Technology
Abstract/Summary:
With the development of cloud computing, more and more distributed applications are migrating from high-performance computing clusters to cloud data centers. These applications require microsecond-level latency and throughput of tens or even hundreds of Gbps between data center hosts. The legacy kernel TCP/IP protocol stack can no longer meet such requirements, so research has shifted to Remote Direct Memory Access (RDMA), a hardware I/O technology that achieves low latency, high throughput, and near-zero CPU overhead, and that is the current benchmark for communication performance between data center hosts. Although RDMA meets the communication performance needs of cloud data center applications, it still lacks high availability. To improve the availability of RDMA, this thesis proposes DTS, a highly available RDMA solution based on switching between an active and a standby NIC. First, since current data center hosts are usually equipped with two NICs, an RDMA NIC and an Ethernet NIC, DTS uses the Ethernet NIC as the backup and switches to it to continue communication when a failure, such as a NIC failure or a network failure, occurs. To save the communication state on the host and share it between the two NICs, DTS designs a software stack that not only adapts to the different types of NICs in the data center but is also compatible with existing RDMA applications. Second, this thesis implements a prototype of DTS, called libdts, analyzes the performance bottlenecks of the naive libdts implementation, and optimizes it with three techniques: Optimized Data Copy, Batch Packet, and Accumulate CQEs. Finally, this thesis builds a testbed for experimental evaluation and verifies that DTS outperforms native RDMA in terms of availability in both NIC failure and network failure scenarios.
Keywords/Search Tags: Data Center Network, RDMA, High Availability