Deep neural networks (DNNs) play a critical role in modern intelligent mobile applications such as speech recognition and natural language processing. DNN-based applications typically require a large amount of computation, yet the processors of current mobile devices cannot deliver enough performance to run them. The traditional approach offloads data to cloud servers for computation, but this leads to high application latency and communication overhead. In recent years, edge intelligence has been proposed to deploy artificial intelligence algorithms on edge devices, enabling intelligent processing at the edge of the network and alleviating the performance bottleneck of cloud servers. In realizing distributed DNN inference with edge intelligence, how to partition the neural network layers reasonably has become one of the hot research topics in edge intelligence.

Most existing model partitioning methods perform only a single split between the server and the mobile device and do not take energy consumption fully into consideration. To address these shortcomings, this thesis studies a distributed deep neural network inference mechanism based on edge intelligence.

First, the advantages of edge intelligence and related technologies in optimizing distributed DNN inference are studied, and the basic elements and characteristics of an edge-intelligence-based distributed DNN inference framework are analyzed. A new model for layer-wise partitioning and distributed inference of DNNs is proposed, laying the foundation for the subsequent research on specific inference optimization and scheduling methods.

On this basis, a collaborative inference optimization and scheduling method for DNNs based on edge intelligence is proposed. An optimization problem for distributed DNN inference tasks is formulated whose objective is to minimize the inference running time of the DNN. To solve this problem, a distributed DNN inference task optimization and scheduling algorithm is proposed that partitions the DNN's network layers multiple times to obtain the device selection queue minimizing the inference time of each layer. Both theoretical proof and experimental verification show that the proposed algorithm achieves shorter inference running time.

Furthermore, this thesis considers the impact of energy consumption on the execution of DNN inference tasks. Building on the edge-intelligence-based collaborative DNN inference model, an energy-constrained distributed DNN inference model and a corresponding optimization and scheduling algorithm are proposed. The algorithm employs a genetic algorithm in which the device selection queue serves as the chromosome and the fitness function is determined by the inference running time and an energy consumption penalty, thereby solving the new problem of energy-constrained distributed DNN inference tasks. Experimental results show that the algorithm achieves shorter inference running time under the new constraint.

The research results of this thesis provide new ideas for optimizing DNN inference based on edge intelligence, can be applied in practice, and have high theoretical value and broad application prospects.
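To make the layer-wise device selection idea concrete, the following is a minimal illustrative sketch, not the thesis's actual algorithm: for each layer, the device minimizing compute time plus (when the layer migrates) activation transfer time is appended to the device selection queue. The `Device` and `Layer` structures and all cost figures are hypothetical assumptions for this sketch.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    compute_ms_per_gflop: float   # assumed per-device compute cost
    link_ms_per_mb: float         # assumed transfer cost over its network link

@dataclass
class Layer:
    gflops: float     # compute demand of the layer
    output_mb: float  # size of the activation passed to the next layer

def schedule(layers, devices):
    """Greedy per-layer device selection: pick, for every layer, the device
    that minimizes compute time plus (if the layer moves) transfer time."""
    queue, total_ms = [], 0.0
    prev, prev_out_mb = None, 0.0
    for layer in layers:
        best_dev, best_ms = None, float("inf")
        for dev in devices:
            ms = layer.gflops * dev.compute_ms_per_gflop
            if prev is not None and dev is not prev:
                # the previous activation must cross the network on a device switch
                ms += prev_out_mb * max(prev.link_ms_per_mb, dev.link_ms_per_mb)
            if ms < best_ms:
                best_dev, best_ms = dev, ms
        queue.append(best_dev.name)
        total_ms += best_ms
        prev, prev_out_mb = best_dev, layer.output_mb
    return queue, total_ms

# Tiny usage example with made-up numbers.
devices = [Device("mobile", 8.0, 0.5), Device("edge", 1.0, 0.5)]
layers = [Layer(0.2, 4.0), Layer(2.0, 0.3), Layer(0.1, 0.01)]
queue, ms = schedule(layers, devices)
print(queue, f"{ms:.2f} ms")
```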
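The abstract specifies the genetic algorithm's encoding precisely: the chromosome is the device selection queue, and the fitness function combines inference running time with an energy consumption penalty. Below is a minimal self-contained sketch of that scheme under assumed per-device latency and energy figures; the device pool, population size, operators, and penalty weight are illustrative choices, not the thesis's settings.

```python
import random

NUM_LAYERS = 8
DEVICES = ["mobile", "edge1", "edge2"]                    # hypothetical device pool
LATENCY_MS = {"mobile": 4.0, "edge1": 1.0, "edge2": 1.5}  # assumed per-layer latency
ENERGY_J   = {"mobile": 0.5, "edge1": 2.0, "edge2": 1.8}  # assumed per-layer energy
TRANSFER_MS = 0.8       # extra latency whenever consecutive layers switch device
ENERGY_BUDGET_J = 10.0  # the energy constraint
PENALTY = 100.0         # weight of the budget-violation penalty (illustrative)

def fitness(chrom):
    """Lower is better: inference time plus a penalty for exceeding the budget."""
    time = sum(LATENCY_MS[d] for d in chrom)
    time += TRANSFER_MS * sum(a != b for a, b in zip(chrom, chrom[1:]))
    energy = sum(ENERGY_J[d] for d in chrom)
    return time + PENALTY * max(0.0, energy - ENERGY_BUDGET_J)

def crossover(a, b):
    cut = random.randrange(1, NUM_LAYERS)   # single-point crossover
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.1):
    return [random.choice(DEVICES) if random.random() < rate else d for d in chrom]

def evolve(pop_size=40, generations=200):
    pop = [[random.choice(DEVICES) for _ in range(NUM_LAYERS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 4]        # elitist selection: keep best quarter
        pop = elite + [mutate(crossover(*random.sample(elite, 2)))
                       for _ in range(pop_size - len(elite))]
    return min(pop, key=fitness)

best = evolve()
print(best, f"fitness={fitness(best):.2f}")
```

The penalty formulation keeps energy-infeasible device queues in the population rather than discarding them, while steering the search toward schedules that stay within the energy budget.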