
Deep Neural Networks Inference Task Deployment Method In Edge Computing Environment

Posted on: 2023-06-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W C He
Full Text: PDF
GTID: 1528306911995319
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of networks, deep learning applications based on Deep Neural Network (DNN) models are sinking to the network edge. DNN inference tasks are deployed on edge computing nodes to provide intelligent services with real-time inference and decision-making capabilities, such as intelligent security and intelligent inspection. In the course of providing these services, long inference delays or service interruptions caused by limited edge resources or device mobility seriously affect the timeliness and continuity of intelligent services. It is therefore necessary to study DNN inference task deployment methods for the edge computing environment that fully reuse the limited edge resources and effectively accommodate the dynamic nature of the edge environment, so as to guarantee the quality of intelligent services.

Partitioning a DNN inference task into multiple dependent subtasks and deploying them on different edge nodes is an important way to improve its computing efficiency. Existing research on DNN inference task deployment has achieved some results, but the following technical challenges remain: (1) existing task partitioning methods designed for chain topologies have difficulty partitioning, at fine granularity, DNN inference tasks whose topology is a directed acyclic graph; (2) when multiple DNN inference tasks are deployed simultaneously, competition for limited computing and communication resources tends to cause delay deterioration, which harms service timeliness, and research on deploying multiple DNN inference tasks in combination with task partitioning is still insufficient; (3) existing studies rarely consider the deterioration of communication conditions and data transmission efficiency caused by the limited service range of edge nodes as devices move, which degrades or even interrupts the completion of DNN inference tasks.

In view of these challenges, and starting from guaranteeing the timeliness and continuity of intelligent services, this dissertation studies DNN inference task deployment in the edge computing environment, breaking through key technologies such as fine-grained partitioning and dynamic deployment of DNN inference tasks. The specific innovative research contents are as follows.

(1) To address the difficulty that existing partitioning methods have with DNN inference tasks exhibiting directed-acyclic-graph topology, a DNN inference task partitioning and deployment method based on graph cuts is proposed. Based on the data and computation dependencies between subtasks in the DNN inference process, a partitioning and deployment model with directed-acyclic-graph topology is constructed under a distributed edge-device collaborative architecture, and on this basis a partitioning and deployment problem oriented toward optimal delay and energy consumption is formulated. Since this problem is a mixed-integer nonlinear program, it is decomposed into two subproblems: DNN inference task partitioning and computing resource allocation. A graph-cuts-based partitioning algorithm is designed, in which an auxiliary graph is constructed to transform the task partitioning problem into a minimum-cut problem, and a max-flow/min-cut algorithm is used to solve for the partitioning decision. On this basis, a computing resource allocation algorithm oriented toward optimal delay and energy consumption is designed to solve for the resource allocation decision. Experimental results show that the proposed method can partition and deploy DNN inference tasks at fine granularity, utilize limited and distributed resources, balance task delay against system energy consumption, and effectively guarantee service timeliness.
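As a rough illustration of the auxiliary-graph construction in contribution (1), the sketch below assigns each DNN layer to the device or to an edge node via a minimum s-t cut, as in the max-flow/min-cut step described above. The layer names, profiled latencies, and the networkx formulation are illustrative assumptions, not the dissertation's exact model.

```python
import networkx as nx

# Toy DNN as a DAG of layers with (assumed) profiled latencies in ms.
device_cost = {"conv1": 8.0, "conv2": 12.0, "fc1": 3.0}  # latency if run on the device
edge_cost = {"conv1": 2.0, "conv2": 3.0, "fc1": 1.0}     # latency if run on the edge node
dag = [("conv1", "conv2", 4.0), ("conv2", "fc1", 1.5)]   # (u, v, transfer latency if cut)

# Auxiliary graph: cutting s->v places layer v on the edge node (pay edge_cost),
# cutting v->t places it on the device (pay device_cost); a DAG arc whose
# endpoints land on different sides pays the transfer latency.
g = nx.DiGraph()
for v in device_cost:
    g.add_edge("s", v, capacity=edge_cost[v])
    g.add_edge(v, "t", capacity=device_cost[v])
for u, v, w in dag:
    g.add_edge(u, v, capacity=w)
    g.add_edge(v, u, capacity=w)

cut_value, (device_side, edge_side) = nx.minimum_cut(g, "s", "t")
print(f"latency proxy of best partition: {cut_value:.1f} ms")
print("layers on device:", sorted(device_side - {"s"}))
print("layers on edge:  ", sorted(edge_side - {"t"}))
```

Because every layer is an individual node of the auxiliary graph, the cut can split the DAG at arbitrary points rather than at a single position along a chain, which is what fine-grained partitioning requires.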
(2) To address the delay deterioration caused by tasks competing for limited computing and communication resources when multiple DNN inference tasks are deployed, a multiple-DNN-inference-task partitioning and deployment method based on Markov approximation is proposed. First, an edge-edge collaborative deployment architecture for multiple DNN inference tasks is constructed. Taking the differentiated resource requirements of DNN inference tasks into account, a deployment process comprising task partitioning, edge node selection, and computing resource allocation is designed, and a delay-optimal deployment problem for multiple DNN inference tasks is modeled. Since this problem is a combinatorial optimization problem, a heuristic DNN inference task partitioning algorithm is designed to obtain the task partitioning decisions. Then, a log-sum-exp approximation is used to transform the edge node selection and computing resource allocation problem into one that can be solved in a distributed fashion, and a distributed algorithm based on a time-reversible Markov chain (sketched below) is designed to obtain the edge node selection and computing resource allocation decisions. Experimental results show that the proposed method can flexibly schedule the available resources of devices and edge nodes, adapt to the dispersed nature of edge resources, meet the differentiated resource requirements of multiple DNN inference tasks, and effectively guarantee service timeliness.

(3) To address the problem that the limited service range of edge nodes degrades communication conditions and data transmission efficiency as devices move, causing service quality to decline or even be interrupted, a DNN inference task dynamic deployment method based on Distributed Proximal Policy Optimization is proposed. First, an edge-device collaborative architecture for dynamic DNN inference task deployment is constructed. According to the device location, the communication conditions, and the resource status of accessible edge nodes, a dynamic deployment process comprising DNN model caching, inference computation offloading, and communication and computing resource allocation is designed. An end-to-end delay model and a deployment cost model covering computing energy consumption, transmission energy consumption, and caching overhead are constructed, and a deployment problem oriented toward optimal delay and deployment cost is formulated. Then, considering the complex decision space and the dynamic network environment, the problem is transformed into a Markov Decision Process. Finally, a dynamic deployment algorithm based on Distributed Proximal Policy Optimization is designed (see the second sketch below). Experimental results show that the proposed method can effectively adapt to the dynamic edge environment, realize the integrated utilization and on-demand allocation of multi-dimensional edge resources, and effectively guarantee service continuity.
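To make the Markov approximation step of contribution (2) concrete, the following is a minimal sketch: the log-sum-exp approximation turns the combinatorial delay minimization into sampling from a time-reversible Markov chain whose stationary distribution is proportional to exp(-β·delay), so the chain concentrates on near-optimal deployments as β grows. The toy delay model and the logistic transition rule are assumptions for illustration only.

```python
import math
import random

T, E, BETA = 4, 3, 2.0            # tasks, edge nodes, approximation parameter
node_speed = [1.0, 1.5, 2.0]      # assumed relative compute capacity per node

def delay(cfg):
    # Stand-in for the system delay model: contention grows with node load.
    load = [cfg.count(e) for e in range(E)]
    return sum(load[cfg[t]] / node_speed[cfg[t]] for t in range(T))

cfg = [random.randrange(E) for _ in range(T)]  # initial deployment
for _ in range(5000):
    t = random.randrange(T)                    # perturb one task's node choice
    new = cfg.copy()
    new[t] = random.randrange(E)
    # Time-reversible (Glauber) transition: the stationary probability of a
    # configuration is proportional to exp(-BETA * delay(configuration)).
    if random.random() < 1.0 / (1.0 + math.exp(BETA * (delay(new) - delay(cfg)))):
        cfg = new

print("deployment:", cfg, "delay:", round(delay(cfg), 3))
```

In the distributed version described above, each edge node would execute transitions of this kind locally; the per-node rule only needs the delay difference between the current and proposed configurations, which is the kind of decomposition the log-sum-exp approximation enables.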
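For contribution (3), the sketch below shows two ingredients such a Distributed Proximal Policy Optimization agent would combine: a reward shaped from the end-to-end delay and deployment cost models named above, and PPO's standard clipped surrogate loss that each worker minimizes on its own trajectories. The weighting parameters and function names are illustrative assumptions.

```python
import numpy as np

def reward(delay_ms, compute_j, transmit_j, cache_cost, lam=0.1, mu=0.05):
    # Assumed scalarization of the two objectives: end-to-end delay and
    # deployment cost (computing energy + transmission energy + caching overhead).
    return -(delay_ms + lam * (compute_j + transmit_j) + mu * cache_cost)

def ppo_clip_loss(ratio, advantage, eps=0.2):
    # Standard PPO clipped surrogate: ratio = pi_new(a|s) / pi_old(a|s),
    # advantage estimated from collected trajectories (e.g., via GAE).
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -np.mean(np.minimum(unclipped, clipped))

# Example batch from one worker: near-1 ratios and mixed advantages.
ratio = np.array([0.9, 1.1, 1.3, 0.7])
advantage = np.array([0.5, -0.2, 1.0, -0.8])
print("surrogate loss:", ppo_clip_loss(ratio, advantage))
```

In the distributed variant, multiple workers would collect trajectories from their own edge environments in parallel and contribute to this update, which suits the dynamic, multi-node setting the dissertation targets; the state would encode device location, channel conditions, node resources, and cache status, and the action would select caching, offloading, and resource allocation decisions.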
Keywords/Search Tags: Edge Computing, Inference Task Deployment, Resource Allocation, Quality of Service Guarantee