With the advent of the Internet of Everything (IoE) era, the explosive growth in the number of Internet of Things (IoT) devices has generated large volumes of real-time data, and traditional centralized cloud computing can no longer meet the growing demand for computation. Edge computing has therefore been widely adopted as a new computing paradigm. At the same time, with the rapid development of Deep Neural Network (DNN) technology and the integration of artificial intelligence with edge computing, the demand for intelligent edge devices is growing rapidly. However, because DNNs are resource-intensive while edge devices are resource-constrained, deploying DNNs on edge devices is a major challenge. Although model compression, the current mainstream solution, can provide some acceleration, a compressed model cannot dynamically adjust its computation according to the input, which limits its performance. In contrast, dynamic inference methods, including network skipping and early exit, can adjust their computational load according to the complexity of each input. In dynamic inference, however, it is important to balance accuracy against computational efficiency when designing the skip strategy and the early exit strategy. On the one hand, in the skipping approach, what to skip, how to skip, and how to skip well are key problems that urgently need to be solved. On the other hand, in the early exit approach, how to improve the performance of shallow branches, and which criterion to use for exiting the network, are major challenges. Given these challenges, this paper studies dynamic inference techniques for deep neural networks with multi-branch execution. The main work and innovations are as follows:

Firstly, this paper proposes ACS (Adaptive Channel Skipping), a branch execution method based on channel skipping. A gating network with a CNN structure, ACS-GN (Gated Network), is designed to guide channel skipping effectively. To further reduce the computational overhead of ACS-GN and improve its efficiency, this paper proposes a dynamic group convolution method, ACS-DG (Dynamic Grouping). Experimental results show that ACS-GN is an efficient gating network: compared with other gating structures, it is suitable not only for residually connected backbone networks such as ResNet but also for densely connected backbone networks such as DenseNet. Compared with other dynamic grouping convolution methods, ACS-DG effectively reduces the computational overhead of the network. Combining the two, ACS improves model accuracy by 0.068% ~ 1.222% and reduces FLOPs by 38.09% ~ 72.36% compared with the existing mainstream method IADI.

Secondly, this paper proposes BEE (Branch Early Exit), a dynamic inference method based on distillation-assisted branch early exit. A self-distillation method, BEE-SDA (Self-Distillation Assisted), is designed, in which the branches jointly guide the last branch and the last branch in turn guides the other branches of the network; BEE-SDA effectively improves the performance of shallow branches. Building on the accuracy improvements from BEE-SDA, and to let most instances exit the network as early as possible, an early exit method BEE-SS (Softmax Similarity), based on the similarity of the branches' Softmax output distributions, is proposed. The experimental results show that BEE-SDA is an effective self-distillation method, improving model accuracy by 0.01% ~ 1.21% compared with other self-distillation methods. Compared with the classic early exit method BranchyNet, BEE-SS accelerates model inference by 0.2x ~ 1.0x while preserving model accuracy.
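To illustrate the channel-skipping idea behind ACS at a high level, the following is a minimal sketch, not the thesis's implementation: a tiny linear "gate" scores each channel from pooled features, and channels whose score falls below a threshold are skipped (their convolution is never computed and they emit zeros). All function names, shapes, and the threshold here are illustrative assumptions.

```python
def gate_scores(pooled, weights, bias):
    # Hypothetical tiny gating network: one linear layer over the
    # channel-pooled features (a stand-in for the CNN-structured ACS-GN).
    return [sum(w * x for w, x in zip(row, pooled)) + b
            for row, b in zip(weights, bias)]

def channel_skip(feature_maps, scores, threshold=0.0):
    # Keep a channel only when its gate score clears the threshold;
    # a skipped channel contributes zeros and its computation is saved.
    out = []
    for channel, score in zip(feature_maps, scores):
        if score > threshold:
            out.append(channel)              # "compute" this channel
        else:
            out.append([0.0] * len(channel)) # skip: emit zeros
    return out
```

In a real dynamic network the gate's decisions change per input, so easy inputs activate fewer channels and cost fewer FLOPs.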
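The distillation objective used by branch-guided training can be sketched as a blend of a hard-label loss with a soft-label term that pulls a shallow branch's distribution toward a deeper branch's. This is a generic self-distillation loss, not BEE-SDA itself; the mixing weight `alpha` and the function names are assumptions.

```python
import math

def kl_div(p, q):
    # KL(p || q) between teacher distribution p and student distribution q.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distill_loss(student_probs, teacher_probs, target, alpha=0.5):
    # Blend hard-label cross-entropy with a soft-label KL term toward the
    # deeper branch; alpha is an assumed mixing weight, not the thesis's value.
    ce = -math.log(student_probs[target])
    return alpha * ce + (1 - alpha) * kl_div(teacher_probs, student_probs)
```

When the shallow branch already matches the deep branch, the KL term vanishes and only the hard-label loss remains.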
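The softmax-similarity exit criterion can likewise be sketched in a few lines: compare the softmax distributions of two consecutive branches and exit once they agree closely enough. Cosine similarity and the threshold `tau` below are illustrative choices, not necessarily the measure or value used by BEE-SS.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def should_exit(prev_probs, cur_probs, tau=0.95):
    # Exit when consecutive branch distributions are nearly identical:
    # cosine similarity >= tau (tau is an assumed threshold).
    dot = sum(a * b for a, b in zip(prev_probs, cur_probs))
    norm_a = math.sqrt(sum(a * a for a in prev_probs))
    norm_b = math.sqrt(sum(b * b for b in cur_probs))
    return dot / (norm_a * norm_b) >= tau
```

Easy inputs stabilize at shallow branches and exit early; hard inputs whose predictions keep shifting continue to deeper branches.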