Neural network models have been widely applied to real-world tasks in many fields, e.g., autonomous driving, face recognition, and machine translation. However, deploying neural networks on mobile devices remains challenging due to limited computational resources. The performance of a neural network is primarily determined by the quality of its data, the efficiency of its algorithms, and the available computational resources. It is therefore vital to develop efficient lightweight neural network algorithms and to swiftly filter high-quality training data under limited computational resources. This thesis aims to achieve both high recognition accuracy and high training efficiency for lightweight neural networks in resource-constrained scenarios. To this end, we focus on the design of lightweight neural networks with strong data mining capability. The main problems are: 1) Low model recognition accuracy: how can we design a simple and efficient lightweight neural network that improves feature extraction and achieves high accuracy while maintaining the original computational complexity? 2) Poor model training efficiency: how can we effectively screen the most informative samples from a dataset under limited computational resources, so as to improve training efficiency?
Based on the above problems, the key contributions of this thesis are summarized as follows:

(1) A lightweight neural network module based on a mixed attention mechanism, the Lightweight Attention Module (LAM), is proposed. While many works design neural networks that use attention mechanisms for feature extraction, few combine lightweight models with attention mechanisms, owing to the architectural complexity this adds in scenarios with limited computational resources. To solve this problem, this thesis proposes LAM, which efficiently integrates channel and spatial attention mechanisms. Specifically, we first use element-wise addition and smaller convolutional kernels in the spatial module, avoiding the vanishing-gradient problem. Second, we replace the multi-layer perceptron (MLP) layer with squeeze-and-excitation layers in the channel module, alleviating the problem of channel dependencies. Finally, we adopt a parallel mechanism to coordinate the two attention modules at low computational cost. Experimental results on benchmark datasets demonstrate that LAM combines the strengths of channel and spatial attention without introducing extra network complexity, and that it effectively extracts useful information from feature maps, thereby improving the accuracy of lightweight models.

(2) A data selection algorithm for lightweight neural networks, Lightweight Data Selection (LDS), is proposed. Existing data selection methods increase the proportion of feature learning in model training by relabeling or reweighting information-rich samples. However, these methods have high computational complexity and require additional expert knowledge, which makes them unsuitable for lightweight networks. To solve this problem, this thesis proposes a data selection method tailored to lightweight neural networks. The method builds a scoring model that rapidly evaluates batches of candidate training samples. Initially, the parameters of the lightweight scoring model are randomly initialized, and all samples are scored through forward inference. Training samples are then selected according to their scores, the lightweight classification network is trained on the high-scoring samples, and both the scoring model and the classification network are updated based on accuracy and loss. Because data selection is a discrete process, this thesis adopts the policy gradient algorithm to handle the resulting discrete optimization problem. Additionally, a limited-memory quasi-Newton algorithm is used to train the models, reducing the number of training epochs and further improving training speed. Experimental results demonstrate that LDS exploits the characteristics of lightweight networks to quickly screen informative samples, reducing the amount of training data and the number of epochs required and thus improving training efficiency.
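The parallel channel/spatial combination described in contribution (1) can be illustrated with a minimal NumPy sketch. This is an assumption-laden simplification, not the thesis implementation: the channel branch is an SE-style squeeze-and-excitation gate with hypothetical weight matrices `w1` and `w2`, and the spatial branch reduces the element-wise-added average and maximum channel summaries with a sigmoid, omitting the small convolutional kernel of the full module.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x, w1, w2):
    """SE-style channel gate: global average pool (squeeze),
    two small dense layers with ReLU (excite), sigmoid gate."""
    z = x.mean(axis=(1, 2))                     # squeeze: (C,)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))   # excite:  (C,)
    return s[:, None, None]                     # broadcastable over (H, W)

def spatial_attention(x):
    """Spatial gate built from element-wise addition of the
    average and max channel summaries (conv kernel omitted)."""
    m = sigmoid(x.mean(axis=0) + x.max(axis=0))  # (H, W)
    return m[None, :, :]                         # broadcastable over C

def lam(x, w1, w2):
    """Parallel coordination: both gates are computed from the
    same input and applied jointly, not sequentially."""
    return x * channel_attention(x, w1, w2) * spatial_attention(x)

# Toy usage on an 8-channel 4x4 feature map with a reduction ratio of 4.
rng = np.random.default_rng(0)
x = rng.random((8, 4, 4))
w1 = rng.random((2, 8)) * 0.1   # hypothetical squeeze weights
w2 = rng.random((8, 2)) * 0.1   # hypothetical excite weights
y = lam(x, w1, w2)              # same shape as x
```

Computing the two gates in parallel from the same input keeps the extra cost to two cheap reductions, which is why the combination adds little network complexity.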
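One round of the LDS loop from contribution (2) can be sketched as follows. The sketch is hedged: the scoring model is reduced to a hypothetical linear-sigmoid scorer over precomputed sample features, and the reward callback stands in for the accuracy/loss feedback from the classification network. The discrete keep/drop selection is updated with the REINFORCE policy gradient, as the thesis uses a policy gradient method for the discrete selection step; the limited-memory quasi-Newton training of the networks themselves is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(theta, feats):
    """Lightweight scoring model (assumed linear here): the
    probability that each sample is worth training on."""
    return 1.0 / (1.0 + np.exp(-feats @ theta))

def select_and_update(theta, feats, reward_fn, lr=0.1):
    """One LDS round: sample a discrete keep/drop decision per
    example, obtain a reward (e.g. validation accuracy of the
    classifier trained on the kept samples), and update the
    scorer with the REINFORCE policy gradient."""
    p = score(theta, feats)
    keep = rng.random(p.shape) < p          # discrete selection step
    reward = reward_fn(keep)                # feedback from classifier
    # REINFORCE: gradient of the log-probability of the sampled
    # decisions, scaled by the scalar reward.
    grad = feats.T @ (keep.astype(float) - p) * reward
    return theta + lr * grad, keep

# Toy usage with 32 samples, 4 features, and a stand-in reward.
feats = rng.normal(size=(32, 4))
theta = np.zeros(4)
theta, keep = select_and_update(theta, feats, lambda k: float(k.mean()))
```

Because selection is sampled rather than thresholded, the scorer receives a well-defined gradient signal despite the discreteness of the keep/drop decision, which is the role the policy gradient plays in LDS.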