Research On Congestion Control And Flow Scheduling Technology For Cloud Datacenters

Posted on: 2020-11-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W B Xie
GTID: 1368330590950407
Subject: Computer Science and Technology
Abstract/Summary:
As business demand for cloud-based services grows explosively, a large number of traditional businesses and applications are gradually moving to cloud data centers, which has led to continuous growth of the cloud services market. Meanwhile, domestic and foreign Internet companies have launched their own cloud platforms to seize the market. Hence, the cloud data center, which is the key to cloud services, has naturally become a focus of research in industry and academia. However, with the explosive growth of tenants and applications and the emergence of new service models and technologies, cloud data centers have undergone tremendous changes, from service models to software and hardware organizational structures. For example, due to the continuous increase in network bandwidth, traditional coarse-grained congestion control mechanisms and flow scheduling schemes struggle to provide satisfactory performance. These changes, from the number of users and application models to the internal structure of data centers, bring new problems and challenges to the Quality of Service (QoS) of cloud data centers. This dissertation focuses on the problems and challenges brought by these changes, and proposes two congestion control mechanisms and one flow scheduling scheme to optimize low-latency, high-concurrency network transport for cloud data centers.

To address the problem that current Explicit Congestion Notification (ECN) cannot provide accurate congestion feedback for data center transport protocols, this dissertation proposes an RTT-based Explicit Congestion Notification mechanism, R-ECN. The distinctive feature of cloud data center transport protocols is that they do not explicitly distinguish long and short flows during network transport. They usually have a wide applicable range and good deployability, and thus are widely used in cloud data centers. Based on the RTT of each packet, R-ECN uses the Gradient of RTT Deviation (GRD) algorithm to calculate the gradient between the RTT of each packet and the average RTT of the transmission path. R-ECN then uses this gradient to dynamically adjust the ECN threshold. In this way, R-ECN can provide accurate congestion feedback as network conditions change and thereby improve the performance of data center transport protocols. The experimental results show that R-ECN further improves the performance of data center transport protocols compared with the current ECN. Moreover, R-ECN can also effectively improve bandwidth utilization; for example, R-ECN-based DCTCP loses 1.95x fewer packets than ECN-based DCTCP.

To address the problem that current ECN cannot provide accurate and differentiated congestion feedback for the multi-queue flow scheduling schemes of data centers, this dissertation presents Queueing-Delay-based ECN (QD-ECN). The prominent advantage of multi-queue flow scheduling schemes is that they implement priority flow scheduling based on the priorities of tenants or the characteristics of flows. Although their applicable range is limited and their deployability is poor, multi-queue flow scheduling schemes have always been a focus of research owing to their good transport performance. According to the differences in average queueing delay between priority queues, QD-ECN allocates multiple differentiated ECN thresholds, one for each queue. Meanwhile, QD-ECN employs the Gradient of Queueing Delay Deviation (G-QDD) algorithm to calculate the gradient between the queueing delay of each queue and the average queueing delay. QD-ECN then uses the gradients of the different priority queues to dynamically adjust the differentiated ECN thresholds. Hence, QD-ECN can provide accurate and differentiated congestion feedback for multi-queue flow scheduling schemes. The experimental results show that QD-ECN further improves the performance of multi-queue flow scheduling schemes and effectively alleviates packet loss. For example, compared with ECN-based PIAS, QD-ECN-based PIAS reduces the 99.9th percentile flow completion time (FCT) of short flows by up to 3.06x. The experimental results also demonstrate that QD-ECN is more robust than ECN.

To address the problem that current flow scheduling schemes cannot simultaneously achieve a wide applicable range, good deployability and good transport performance, this dissertation presents the host-based flow scheduling scheme SPQ. Current flow scheduling schemes take two extreme approaches. On one hand, information-aware schemes focus on achieving good performance while largely overlooking the flexibility and complexity of the design. On the other hand, information-agnostic schemes make no assumptions about the availability of detailed flow information and hence apply to a wide range of applications, but offer limited performance. SPQ is an information-agnostic and readily deployable flow scheduling scheme that provides near-optimal FCT for latency-sensitive applications and effectively harnesses the long-tail behavior of flows. Unlike existing in-network priority schemes, SPQ enables host-based, fine-grained flow scheduling, leaving the in-network queueing mechanism simple. SPQ makes no assumptions about the availability of flow information and hence can be applied to any type of data center application. Moreover, SPQ approximates the Least Attained Service (LAS) scheduling discipline and hence is a near-optimal solution. Meanwhile, SPQ uses two novel feedback adjustment mechanisms to alleviate the possible negative impact of long flows on short flows. The results demonstrate that SPQ effectively addresses some major limitations of in-network priority schemes, resulting in near-optimal performance in reducing average and tail latency. For example, the average FCT of short flows under SPQ has a gap of only 0 to 1.1% with respect to the ideal information-aware scheme.
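The abstract describes R-ECN only at a high level: a gradient of RTT deviation drives a dynamic ECN threshold. The following Python sketch illustrates one way such a control loop could look. It is not the GRD algorithm from the dissertation; the class name, the EWMA average, the gain, the bounds, and the update rule (lower the threshold when RTT deviation is rising, raise it when falling) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GradientEcnThreshold:
    """Hypothetical ECN marking threshold steered by an RTT-deviation gradient.

    Every constant and the update rule are assumptions, not values from the source.
    """
    threshold_pkts: float = 65.0      # assumed starting threshold, in packets
    min_pkts: float = 5.0             # assumed lower bound
    max_pkts: float = 200.0           # assumed upper bound
    alpha: float = 0.125              # assumed EWMA weight for the average RTT
    gain: float = 0.5                 # assumed sensitivity to the gradient
    avg_rtt: Optional[float] = None   # running average RTT of the path
    prev_dev: float = 0.0             # RTT deviation seen on the previous sample

    def on_rtt_sample(self, rtt: float) -> float:
        """Update the threshold from one per-packet RTT sample and return it."""
        if self.avg_rtt is None:
            # First sample only initialises the average.
            self.avg_rtt = rtt
            return self.threshold_pkts
        dev = rtt - self.avg_rtt                          # deviation from the average
        gradient = (dev - self.prev_dev) / self.avg_rtt   # normalised change in deviation
        self.prev_dev = dev
        self.avg_rtt = (1 - self.alpha) * self.avg_rtt + self.alpha * rtt
        # Assumed rule: a positive gradient (queues building) lowers the threshold
        # so marking starts earlier; a negative gradient raises it.
        new_threshold = self.threshold_pkts * (1.0 - self.gain * gradient)
        self.threshold_pkts = min(self.max_pkts, max(self.min_pkts, new_threshold))
        return self.threshold_pkts
```

Feeding the controller RTT samples of 100, 150 and 100 (in any consistent time unit) first leaves the threshold at its initial value, then lowers it as the RTT rises, then raises it again as the RTT falls back.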
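The abstract states that SPQ approximates the Least Attained Service (LAS) discipline. The sketch below simulates plain LAS itself, not SPQ or its feedback mechanisms: at each step the flow that has received the least service so far is served next, so no knowledge of flow sizes is needed to favour short flows. The function name, the `quantum` parameter and the tie-breaking by flow id are illustrative assumptions.

```python
def las_schedule(flow_sizes, quantum=1):
    """Simulate Least Attained Service and return flow ids in completion order.

    flow_sizes maps a flow id to its total size in arbitrary units. The
    scheduler never looks at the remaining size when choosing a flow; it
    only compares the service each flow has already received.
    """
    attained = {fid: 0 for fid in flow_sizes}
    active = set(flow_sizes)
    finished = []
    while active:
        # Pick the active flow with the least attained service (ties by flow id).
        fid = min(active, key=lambda f: (attained[f], f))
        # Serve at most one quantum, without exceeding the flow's size.
        attained[fid] += min(quantum, flow_sizes[fid] - attained[fid])
        if attained[fid] >= flow_sizes[fid]:
            active.remove(fid)
            finished.append(fid)
    return finished
```

For example, `las_schedule({"long": 10, "short": 2, "mid": 5})` returns `["short", "mid", "long"]`: the short flow completes first even though the scheduler was never told which flow is short.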
Keywords/Search Tags: Cloud Data Center, Congestion Control, Flow Scheduling, Network Transmission, Explicit Congestion Notification