
Research And Implementation Of Synergistic Inference Acceleration For Edge Intelligence Applications

Posted on: 2023-04-08    Degree: Master    Type: Thesis
Country: China    Candidate: H T Wang    Full Text: PDF
GTID: 2558307061953669    Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of artificial intelligence, deep neural networks (DNNs), as the representative class of algorithms, have been widely studied and applied in many intelligent application scenarios, such as image classification, natural language processing, and video analysis. DNN inference in these applications is task-intensive and computation-intensive, requiring massive resources. However, most end devices are equipped with only weak processing power, and even cloud computing incurs unacceptable communication latency that severely limits inference performance. To address the shortcomings of existing computing modes, edge intelligence has emerged in recent years as a promising paradigm for accelerating DNN inference. Building on the advantages of edge computing, edge intelligence migrates intelligent services from the remote cloud to the network edge, improving the response speed and reliability of those services.

Existing work on edge intelligence mainly focuses on model optimization and synergistic inference optimization. However, existing model compression methods are prone to irreversible loss of accuracy, and although model early-exit methods preserve acceptable accuracy, they cannot cope with the diversity of data distributions, which leads to considerable computational redundancy. As for synergistic inference optimization, existing methods require significant data transmission between devices and therefore suffer performance deterioration in dynamic network environments. In addition, limited edge resources and heterogeneous end devices pose tougher challenges for synergistic inference acceleration.

To tackle these problems, this thesis focuses on edge intelligence scenarios and introduces the early-exit mechanism of multi-exit DNNs into the end-to-edge synergistic inference environment. DNN inference is accelerated by optimizing the exit settings of the multi-exit DNN and the offloading decisions of inference tasks, in three parts:

(1) To address the model computational redundancy caused by the diversity of data distributions, a data-aware exit selection mechanism for multi-exit DNNs is proposed. The mechanism gathers data-aware information by collecting the exit probability distribution over the model's exits and the distribution of computation across the model's layers, and based on this information adaptively determines the exit settings, including the placement and number of exits, to minimize the computation of the multi-exit DNN (see the first sketch after this list).

(2) To handle the heterogeneity of end devices, the dynamics of networks, and the limits of edge resources, a joint model partition and resource allocation optimization mechanism for end-to-edge synergistic inference is proposed. The mechanism considers the model characteristics, the network status, and the resource constraints simultaneously to reduce the average task execution time: it adaptively slices the model to determine the computing proportions of end devices and edge servers, and then allocates resources to meet the requirements of heterogeneous inference tasks (see the second sketch after this list).

(3) Based on the above theoretical research, a synergistic inference acceleration prototype system for edge intelligence applications is designed and implemented. The system adopts an end-to-edge synergistic framework composed mainly of a model optimization module and a joint decision-making module. It is deployed and evaluated in an edge computing environment, and the experimental results verify its performance improvement on edge intelligence applications.
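To make the early-exit mechanism concrete, the first sketch below shows confidence-based early exit in a multi-exit DNN placed in an end-to-edge setting: the end device runs the first few stages and returns early when an exit head is confident enough, otherwise it offloads the intermediate features to the edge server. This is an illustration only, not the thesis's implementation: the backbone stages, exit heads, confidence thresholds, and split index are assumed placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiExitNet(nn.Module):
    """Toy backbone split into stages, with an exit head after each stage."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU()),
            nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU()),
        ])
        self.exits = nn.ModuleList([
            nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                          nn.Linear(c, num_classes))
            for c in (16, 32, 64)
        ])

def device_side(model, x, split_idx, thresholds):
    # Run the first split_idx stages on the end device (batch size 1 assumed).
    # Exit early if an exit head's top softmax probability clears its
    # threshold; otherwise return the intermediate features for offloading.
    for i in range(split_idx):
        x = model.stages[i](x)
        logits = model.exits[i](x)
        if F.softmax(logits, dim=1).max().item() >= thresholds[i]:
            return "done", logits
    return "offload", x  # this intermediate tensor is what gets transmitted

def edge_side(model, x, split_idx):
    # The edge server finishes the remaining stages and uses the final exit.
    for i in range(split_idx, len(model.stages)):
        x = model.stages[i](x)
    return model.exits[-1](x)

model = MultiExitNet().eval()
with torch.no_grad():
    status, payload = device_side(model, torch.randn(1, 3, 32, 32),
                                  split_idx=2, thresholds=(0.9, 0.8))
    logits = payload if status == "done" else edge_side(model, payload, split_idx=2)
```

The second sketch is a back-of-envelope view of the split-point decision in contribution (2): given per-stage latencies on the device and on the edge, the size of each candidate intermediate tensor, and the current bandwidth, pick the partition that minimizes predicted end-to-end latency. All numbers are assumed for illustration, not measurements from the thesis, and the real mechanism additionally allocates edge resources across competing tasks.

```python
def best_split(device_ms, edge_ms, feature_kb, bandwidth_kbps):
    # device_ms[i] / edge_ms[i]: latency of stage i on the device / edge server.
    # feature_kb[k]: payload sent when the first k stages run on the device
    # (feature_kb[0] is the raw input; feature_kb[n] is 0, nothing to offload).
    n = len(device_ms)
    candidates = []
    for k in range(n + 1):
        transfer_ms = feature_kb[k] / bandwidth_kbps * 1000.0
        total = sum(device_ms[:k]) + transfer_ms + sum(edge_ms[k:])
        candidates.append((total, k))
    return min(candidates)  # (predicted latency in ms, split index)

# Assumed numbers: a slow device, a fast edge server, moderate bandwidth.
latency, k = best_split(
    device_ms=[40, 60, 90], edge_ms=[4, 6, 9],
    feature_kb=[150, 64, 128, 0], bandwidth_kbps=2000,
)
print(f"split after stage {k}, predicted latency {latency:.1f} ms")
```

With these assumed numbers the search picks k = 1: running one stage locally shrinks the payload enough that transferring it and finishing on the fast edge beats both full offloading and fully local execution, which is exactly the trade-off the joint optimization navigates as bandwidth and load change.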
To sum up, this thesis conducts theoretical research and system implementation on DNN inference acceleration in edge intelligence application scenarios. Experimental results show that the proposed mechanisms effectively accelerate DNN inference and adapt well to heterogeneous end devices and dynamic networks. The theoretical results and prototype system are conducive to promoting the application of artificial intelligence in edge computing scenarios and provide support for building edge intelligence application services.
Keywords/Search Tags: Edge intelligence, synergistic inference, multi-exit DNN, model partition, resource allocation