| The rapid development of the Internet has made intelligent mobile devices far more convenient to use, and both the variety of these devices and the applications running on them have grown dramatically. With the rise of short video, many applications now rely on continuous video streams: people take photos with mobile devices, shoot video with drones, and use wearable devices to help the elderly identify objects. In recent years video has become a dominant medium for expression and creation across society, and the demand for video stream recognition on intelligent mobile devices is increasing sharply. Deep neural networks are currently the common choice for continuous video stream tasks. However, deep neural networks require substantial resources to run, which makes them very difficult to deploy on mobile platforms. Offloading the processing to the cloud, in turn, increases data transfer time, is ill-suited to tasks with real-time requirements, and raises the risk of exposing user privacy. An emerging paradigm for solving this problem is to design lightweight neural networks that satisfy both the resource constraints of mobile devices and the requirements of the target task. First, for local video streams, the main processing methods include the inter-frame difference method, the background difference method, and key frame extraction, but these traditional methods are not suitable for deployment on mobile terminals because of the complexity of their computation and implementation. Second, object recognition on mobile devices through lightweight neural network design has been the main research direction in recent years, covering model compression, manual design of lightweight neural networks, and neural network architecture search. The first two approaches rely on manual experience, so the resulting network structures lack variety. Architecture search has gradually reduced the dependence on human design, but recognition tasks are now highly varied, and existing methods still fail to account for the diverse needs of these tasks; that is, the resulting neural network cannot adapt. Based on an analysis of the above methods, this thesis proposes an adaptive lightweight neural network architecture for real-time object recognition in video streams, addressing the problem of how to achieve real-time object recognition effectively on resource-limited platforms. First, a video frame screening scheme based on motion vectors and first view acceleration is proposed to reduce latency at its source. Second, because the multiple tasks on a device place different requirements on recognition results, multi-objective optimization is adopted to constrain the model along multiple dimensions so that it meets the requirements of different tasks. Finally, an efficient neural network architecture search algorithm is designed to ensure that the resulting network performs real-time object recognition well; it comprises a lightweight search space, a multi-objective search algorithm, and weight reuse to speed up model evaluation. Experimental results show that this work is competitive on both public and custom datasets and achieves real-time object recognition in video streams with high accuracy and low overhead. |
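
The idea of constraining a model along several dimensions at once can be illustrated with a Pareto filter over candidate architectures. The sketch below is not the thesis's search algorithm; it only shows, under assumptions, how candidates might be compared when accuracy, latency, and model size all matter. The `Candidate` type, the three chosen objectives, and the numeric values are hypothetical and made up for illustration; they are not results from this work.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical summary of one candidate network (illustrative fields)."""
    name: str
    accuracy: float    # higher is better
    latency_ms: float  # lower is better
    params_m: float    # parameters in millions, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = (
        a.accuracy >= b.accuracy
        and a.latency_ms <= b.latency_ms
        and a.params_m <= b.params_m
    )
    strictly_better = (
        a.accuracy > b.accuracy
        or a.latency_ms < b.latency_ms
        or a.params_m < b.params_m
    )
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up values, purely to exercise the filter.
candidates = [
    Candidate("A", accuracy=0.72, latency_ms=30.0, params_m=3.0),
    Candidate("B", accuracy=0.70, latency_ms=35.0, params_m=3.5),
    Candidate("C", accuracy=0.75, latency_ms=50.0, params_m=5.0),
    Candidate("D", accuracy=0.68, latency_ms=20.0, params_m=2.0),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this toy set, candidate B is dropped because A is better on all three objectives, while A, C, and D each represent a different trade-off. A task with a strict latency budget might pick D, and an accuracy-critical task might pick C, which is one way a single search could serve tasks with different requirements.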