
Research And Implementation Of Synergistic Inference Acceleration For Edge Intelligence Applications

Posted on: 2023-04-08    Degree: Master    Type: Thesis
Country: China    Candidate: H T Wang    Full Text: PDF
GTID: 2558307061953669    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of artificial intelligence, deep neural networks (DNNs), as the representative class of algorithms, have been widely studied and applied in many intelligent application scenarios, such as image classification, natural language processing, and video analysis. DNN inference in these applications is task-intensive and computation-intensive, requiring massive resources. However, most end devices are equipped with only weak processing power, and even cloud computing incurs unacceptable communication latency that severely limits inference performance. To address the shortcomings of existing computing modes, edge intelligence has emerged in recent years as a promising paradigm for accelerating DNN inference. Building on the advantages of edge computing, edge intelligence migrates intelligent services from the remote cloud to the network edge, improving the response speed and reliability of those services.

Existing work on edge intelligence mainly focuses on model optimization and synergistic inference optimization. However, existing model compression methods are prone to irreversible loss of accuracy, and although model early-exit methods preserve acceptable accuracy, they cannot cope with the diversity of data distributions, which leads to considerable computational redundancy. As for synergistic inference optimization, existing methods require significant data transmission between devices and therefore suffer performance deterioration in dynamic network environments. In addition, limited edge resources and heterogeneous end devices pose tougher challenges for synergistic inference acceleration.

To tackle these problems, this thesis focuses on edge intelligence scenarios and introduces the early-exit mechanism of multi-exit DNNs into the end-to-edge synergistic inference environment. DNN inference is accelerated by optimizing the exit settings of the multi-exit DNN and the offloading decisions of inference tasks, in three parts:

(1) To address the model computational redundancy caused by the diversity of data distributions, a data-aware exit selection mechanism for multi-exit DNNs is proposed. The mechanism gathers data-aware information by collecting the exit probability distribution over the model's exits and the distribution of computation across the model's layers, and based on this information adaptively determines the exit settings, including the placement and number of exits, to minimize the computation of the multi-exit DNN (see the first sketch after this list).

(2) To handle the heterogeneity of end devices, the dynamics of networks, and the limits of edge resources, a joint model partition and resource allocation optimization mechanism for end-to-edge synergistic inference is proposed. The mechanism considers the model characteristics, the network status, and the resource constraints simultaneously to reduce the average task execution time: it adaptively slices the model to determine the computing proportions of end devices and edge servers, and then allocates resources to meet the requirements of heterogeneous inference tasks (see the second sketch after this list).

(3) Based on the above theoretical research, a synergistic inference acceleration prototype system for edge intelligence applications is designed and implemented. The system adopts an end-to-edge synergistic framework composed mainly of a model optimization module and a joint decision-making module. It is deployed and evaluated in an edge computing environment, and the experimental results verify its performance improvement on edge intelligence applications.
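To make the early-exit mechanism concrete, the first sketch below shows confidence-based early exit in a multi-exit DNN placed in an end-to-edge setting: the end device runs the first few stages and returns early when an exit head is confident enough, otherwise it offloads the intermediate features to the edge server. This is an illustration only, not the thesis's implementation: the backbone stages, exit heads, confidence thresholds, and split index are assumed placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiExitNet(nn.Module):
    """Toy backbone split into stages, with an exit head after each stage."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU()),
        ])
        self.exits = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(c, num_classes))
            for c in (16, 32, 64)
        ])

def device_side(model, x, split_idx, thresholds):
    # Run the first split_idx stages on the end device (batch size 1 assumed).
    # Exit early if an exit head's top softmax probability clears its
    # threshold; otherwise return the intermediate features for offloading.
    for i in range(split_idx):
        x = model.stages[i](x)
        logits = model.exits[i](x)
        if F.softmax(logits, dim=1).max().item() >= thresholds[i]:
            return "done", logits
    return "offload", x  # this intermediate tensor is what gets transmitted

def edge_side(model, x, split_idx):
    # The edge server finishes the remaining stages and uses the final exit.
    for i in range(split_idx, len(model.stages)):
        x = model.stages[i](x)
    return model.exits[-1](x)

model = MultiExitNet().eval()
with torch.no_grad():
    status, payload = device_side(model, torch.randn(1, 3, 32, 32),
                                  split_idx=2, thresholds=(0.9, 0.8))
    logits = payload if status == "done" else edge_side(model, payload, split_idx=2)
```

The second sketch is a back-of-envelope view of the split-point decision in contribution (2): given per-stage latencies on the device and on the edge, the size of each candidate intermediate tensor, and the current bandwidth, pick the partition that minimizes predicted end-to-end latency. All numbers are assumed for illustration, not measurements from the thesis, and the real mechanism additionally allocates edge resources across competing tasks.

```python
def best_split(device_ms, edge_ms, feature_kb, bandwidth_kbps):
    # device_ms[i] / edge_ms[i]: latency of stage i on the device / edge server.
    # feature_kb[k]: payload sent when the first k stages run on the device
    # (feature_kb[0] is the raw input; feature_kb[n] is 0, nothing to offload).
    n = len(device_ms)
    candidates = []
    for k in range(n + 1):
        transfer_ms = feature_kb[k] / bandwidth_kbps * 1000.0
        total = sum(device_ms[:k]) + transfer_ms + sum(edge_ms[k:])
        candidates.append((total, k))
    return min(candidates)  # (predicted latency in ms, split index)

# Assumed numbers: a slow device, a fast edge server, moderate bandwidth.
latency, k = best_split(
    device_ms=[40, 60, 90], edge_ms=[4, 6, 9],
    feature_kb=[150, 64, 128, 0], bandwidth_kbps=2000,
)
print(f"split after stage {k}, predicted latency {latency:.1f} ms")
```

With these assumed numbers the search picks k = 1: running one stage locally shrinks the payload enough that transferring it and finishing on the fast edge beats both full offloading and fully local execution, which is exactly the trade-off the joint optimization navigates as bandwidth and load change.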
To sum up, this thesis conducts theoretical research and system implementation on DNN inference acceleration in edge intelligence application scenarios. Experimental results show that the proposed mechanisms effectively accelerate DNN inference and adapt well to heterogeneous end devices and dynamic networks. The theoretical results and prototype system are conducive to promoting the application of artificial intelligence in edge computing scenarios and provide support for building edge intelligence application services.
Keywords/Search Tags: Edge intelligence, synergistic inference, multi-exit DNN, model partition, resource allocation