Edge computing deploys servers at the edge of the network to provide computing services to nearby users, and is characterized by its ability to complete the computation required by end devices with low latency. In real application scenarios, an edge server must provide deep learning inference services to a large number of terminal devices. These workloads are demanding along two dimensions: high concurrency and a wide variety of task types. Targeting the edge-server computing capability typically provided by graphics processing units (GPUs), related work focuses mainly on model simplification and resource scheduling. The former suffers from a loss of accuracy, while the latter ignores the fact that batch processing can greatly improve the throughput of an edge server, and relies on numerical and simulation experiments to verify scheduling performance, which is difficult to reproduce in real systems. To address these problems, this paper proposes a batch-based scheduling strategy for a single type of task and a computing-resource allocation strategy for multiple types of tasks. The main contributions are summarized as follows:

1. We study scheduling strategies for a single type of task. Experimental tests verify that batch processing greatly improves task throughput. Based on this observation, we propose a task scheduling scheme built on dynamic batching and analyze the relationship among task arrival rate, batch size, and system throughput. We then define an optimization problem whose goal is to maximize throughput, and design a low-complexity approximation algorithm that finds a near-optimal batch size (a sketch of the batching logic and of a simple throughput model appears after this list). Finally, we build a testbed to evaluate the proposed solution.

2. We study the resource scheduling strategy for multiple types of tasks. Because different types of inference tasks interfere with one another during execution, the system state space becomes very large, so we propose a scheduling strategy based on deep reinforcement learning (see the sketch after this list). We first verify the feasibility of using virtualization technology to partition GPU computing power, and then propose a scheduling system architecture consisting of a task manager and a resource scheduler. We then define an NP-hard optimization problem that describes the scheduling of multiple task types, and design a scheduling algorithm based on deep reinforcement learning together with a corresponding task priority algorithm. Finally, we evaluate the proposed scheme on the testbed; the results show that the proposed strategy outperforms existing strategies.
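To make the dynamic batching idea in contribution 1 concrete, the following is a minimal sketch of a batching dispatcher. The class name DynamicBatcher, the fixed timeout rule, and the infer_fn callback are illustrative assumptions, not the thesis's implementation; the proposed scheme additionally adapts the batch size to the observed arrival rate.

```python
import queue
import threading
import time

class DynamicBatcher:
    """Collects inference requests and dispatches them in batches.

    A batch is released when either `batch_size` requests have
    accumulated or `timeout` seconds have elapsed since the first
    queued request, gaining throughput without unbounded waiting.
    (Hypothetical sketch; names and the timeout rule are assumptions.)
    """

    def __init__(self, batch_size, timeout, infer_fn):
        self.batch_size = batch_size   # target batch size b
        self.timeout = timeout         # max wait for a full batch (s)
        self.infer_fn = infer_fn       # runs one batched GPU inference
        self.requests = queue.Queue()

    def submit(self, request):
        self.requests.put(request)

    def run(self):
        while True:
            batch = [self.requests.get()]        # block for first item
            deadline = time.monotonic() + self.timeout
            while len(batch) < self.batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self.requests.get(timeout=remaining))
                except queue.Empty:
                    break
            self.infer_fn(batch)                 # one batched forward pass

# Toy usage with a stand-in for a batched model forward pass.
def fake_infer(batch):
    print(f"inference on batch of {len(batch)}")

batcher = DynamicBatcher(batch_size=8, timeout=0.01, infer_fn=fake_infer)
threading.Thread(target=batcher.run, daemon=True).start()
for i in range(20):
    batcher.submit(i)
time.sleep(0.1)
```

The design tension this illustrates is the one the optimization problem captures: a larger batch amortizes the GPU's per-invocation cost and raises throughput, but forces early-arriving tasks to wait longer, which is why the batch size must be tuned to the arrival rate.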
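The relationship among arrival rate, batch size, and throughput analyzed in contribution 1 can be illustrated with a simple model. This is an assumed form for exposition only: the affine latency P(b), the delay bound D(b) <= D_max, and the symbols alpha, beta, lambda, b_max are not taken from the thesis.

```latex
% Illustrative model only; all symbols are assumptions, not the thesis's.
% \lambda: task arrival rate, b: batch size, D(b): worst-case task delay.
\begin{align}
  P(b) &= \alpha + \beta b
    && \text{(affine batched-inference latency)} \\
  T(b) &= \frac{b}{\max\{P(b),\ b/\lambda\}}
    && \text{(steady-state throughput)} \\
  b^\star &= \operatorname*{arg\,max}_{1 \le b \le b_{\max}} T(b)
    \quad \text{s.t. } D(b) \le D_{\max}
    && \text{(batch-size selection)}
\end{align}
```

Under this form, throughput is capped by the arrival rate \(\lambda\) when arrivals are slow and by the GPU batch latency \(P(b)\) when arrivals are fast, which is the trade-off a near-optimal batch-size search must balance.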
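For contribution 2, the following is a hedged sketch of the kind of deep-reinforcement-learning scheduler described, written as a one-step Q-learning update over virtual GPU partitions. The state and action encodings, the reward, the network shape, and all constants are illustrative assumptions; the thesis's actual algorithm and its task priority rule are not reproduced here.

```python
import random
import torch
import torch.nn as nn

N_TASK_TYPES = 4   # kinds of inference models served (assumed)
N_PARTITIONS = 4   # virtual GPU slices to assign (assumed)

# State: queue length per task type plus current load of each GPU slice.
STATE_DIM = N_TASK_TYPES + N_PARTITIONS
# Action: which virtual GPU slice the head-of-queue task is placed on.
N_ACTIONS = N_PARTITIONS

q_net = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def select_action(state, eps=0.1):
    """Epsilon-greedy choice over the Q-network's slice scores."""
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def td_update(state, action, reward, next_state, gamma=0.99):
    """One temporal-difference step toward r + gamma * max_a' Q(s', a')."""
    q = q_net(state)[action]
    with torch.no_grad():
        target = reward + gamma * q_net(next_state).max()
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Toy interaction with a made-up environment transition.
s = torch.rand(STATE_DIM)
a = select_action(s)
r = torch.tensor(1.0)        # e.g. tasks finished within their deadline
s_next = torch.rand(STATE_DIM)
td_update(s, a, r, s_next)
```

The point of the sketch is the framing rather than the learning details: partitioning the GPU via virtualization turns resource allocation into a sequential decision problem whose state space is too large to enumerate, which motivates learning the placement policy.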