Deep learning methods, including convolutional neural network (CNN) models, are an effective solution to many application problems in real-world IoT scenarios. However, deep learning models are characterized by large numbers of parameters and high computational cost, so model deployment and inference face many challenges in resource-constrained IoT environments. In related research and practical applications, the edge-cloud collaborative computing paradigm is gradually emerging: the edge and the cloud complete computing tasks together through reasonable cooperation. Classic convolutional neural networks have a clear layered structure and are easy to decouple, which makes it possible to apply the edge-cloud collaborative computing paradigm to model inference. However, how to effectively use edge-cloud collaboration to support deep learning models in real IoT scenarios still presents many difficulties, and the edge-cloud collaborative inference acceleration of deep learning models is one of the current research hotspots.

In this research direction, many effective methods have been proposed, such as model partitioning, quantization compression, and bottleneck structure embedding, and relevant studies have demonstrated their effectiveness. However, applying these methods affects the accuracy of the model. How to balance the acceleration effect against model accuracy, and how to make an appropriate choice among the many feasible acceleration strategies, are important problems that urgently need to be solved; addressing them is one of the research contents of this work.

This work integrates the above methods into a fused set of basic edge-cloud collaborative inference acceleration strategies. Then, targeting the optimal selection among these fused strategies, an optimal strategy search algorithm based on simulated annealing is proposed. Experiments verify the effectiveness of the algorithm: by searching only part of the strategy space, it obtains the optimal strategy under the corresponding evaluation function, which greatly reduces the computational cost of model training, measurement, and analysis.
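To make the search concrete, the following is a minimal, self-contained sketch of such a simulated-annealing strategy search. The strategy space (partition point, quantization bit-width, bottleneck embedding), the cost weights, and the synthetic profiling functions are illustrative assumptions rather than the actual implementation of this work; in practice, evaluation would profile the partitioned model on real edge and cloud devices.

import math
import random

# Candidate strategy dimensions (all values are illustrative assumptions).
PARTITION_POINTS = list(range(1, 12))   # layer index at which to split the CNN
BITWIDTHS = [32, 16, 8]                 # quantization options
BOTTLENECK = [False, True]              # embed a compression bottleneck at the split?

def profile_latency(split, bits, bottleneck):
    # Synthetic stand-in: real use would time the edge stage, the transfer,
    # and the cloud stage of the partitioned, quantized model.
    edge = 2.0 * split * (bits / 32.0)
    transfer = (1.5 if bottleneck else 5.0) * (bits / 32.0)
    cloud = 1.0 * (11 - split)
    return edge + transfer + cloud

def profile_accuracy_loss(split, bits, bottleneck):
    # Synthetic stand-in: real use would evaluate the modified model on a test set.
    loss = {32: 0.0, 16: 0.2, 8: 1.0}[bits]
    if bottleneck:
        loss += 0.5 / split   # assumed: bottlenecks hurt more at shallow splits
    return loss

def evaluate(strategy):
    # Evaluation function balancing latency against accuracy loss (weight assumed).
    split, bits, bottleneck = strategy
    return (profile_latency(split, bits, bottleneck)
            + 10.0 * profile_accuracy_loss(split, bits, bottleneck))

def neighbor(strategy):
    # Randomly perturb one dimension of the current strategy.
    split, bits, bottleneck = strategy
    dim = random.randrange(3)
    if dim == 0:
        split = random.choice(PARTITION_POINTS)
    elif dim == 1:
        bits = random.choice(BITWIDTHS)
    else:
        bottleneck = random.choice(BOTTLENECK)
    return (split, bits, bottleneck)

def anneal(initial, t0=1.0, cooling=0.95, steps=200):
    # Standard simulated annealing: worse strategies are accepted with a
    # probability that shrinks as the temperature cools, so only part of
    # the strategy space needs to be evaluated.
    current = best = initial
    cur_cost = best_cost = evaluate(initial)
    t = t0
    for _ in range(steps):
        cand = neighbor(current)
        cost = evaluate(cand)
        if cost < cur_cost or random.random() < math.exp((cur_cost - cost) / t):
            current, cur_cost = cand, cost
            if cost < best_cost:
                best, best_cost = cand, cost
        t *= cooling
    return best, best_cost

best_strategy, best_cost = anneal(initial=(6, 32, False))

Only the evaluate calls touch the expensive profiling step, which is where the claimed savings over exhaustively evaluating every strategy come from.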
On the other hand, in real IoT scenarios, how to dynamically adjust the collaborative inference acceleration strategy as environmental factors change is another important problem. To solve it, and in combination with the aforementioned optimal strategy search, this work proposes an edge-cloud collaborative inference acceleration framework for deep learning models. The framework improves the practical applicability of edge-cloud collaborative inference acceleration in a modular way: it handles both the optimal selection of the acceleration strategy in the offline phase and the dynamic scheduling of strategies in the online running phase. In the experimental part, the improvement in overall effect brought by the dynamic scheduling method is verified, and further experiments are carried out on the adopted edge-cloud collaborative inference acceleration methods. These experiments verify the effectiveness of the methods and analyze their problems and shortcomings, which demonstrates the necessity of the framework's optimal strategy search algorithm and its dynamic scheduling of strategies.
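As a companion to the search, the following sketch illustrates one plausible form the online dynamic scheduling could take, assuming the offline phase has produced a table mapping network conditions to the best strategy found for each. The bandwidth bands, thresholds, and callback names are hypothetical, not the framework's actual interface.

import time

# Hypothetical offline result: best (split, bits, bottleneck) per bandwidth band.
STRATEGY_TABLE = {
    "high":   (9, 32, False),   # ample bandwidth: split late, no compression
    "medium": (6, 16, True),    # moderate bandwidth: quantize and compress
    "low":    (2, 8,  True),    # poor bandwidth: keep most layers on the edge
}

def bandwidth_band(mbps):
    # Assumed thresholds separating the bandwidth regimes.
    if mbps >= 50.0:
        return "high"
    if mbps >= 10.0:
        return "medium"
    return "low"

def scheduling_loop(measure_bandwidth, apply_strategy, interval_s=5.0):
    # Periodically re-measure the link and switch strategies when the
    # bandwidth band changes, instead of re-running the offline search online.
    active = None
    while True:
        band = bandwidth_band(measure_bandwidth())
        if band != active:
            apply_strategy(STRATEGY_TABLE[band])
            active = band
        time.sleep(interval_s)

The point of the table lookup is that the expensive strategy search happens entirely offline; the online phase only selects among precomputed strategies as conditions change.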