
Research On Deep Convolutional Neural Network Architecture Optimization Method And Application

Posted on: 2023-03-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Peng
Full Text: PDF
GTID: 1522306911980759
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
The continuous improvement in deep neural network performance can largely be attributed to the ongoing optimization and rapid development of network architectures. For a long time, most classical neural architectures have been hand-designed by human experts. However, designing a neural architecture is a time-consuming and labor-intensive task that relies heavily on prior knowledge and experience. In recent years, manually designed networks have found it increasingly difficult to meet the performance demands of diverse practical applications. To address this, much cutting-edge research has turned to automatically designing (searching) and optimizing neural architectures with algorithms, so as to reduce the consumption of human and computational resources. However, automatic architecture design and optimization is still at an early stage of development and faces many bottlenecks, such as inefficient search algorithms, high computational cost, and poor transferability. To address these problems, this dissertation centers on optimization methods for deep neural architectures and carries out a series of method and application studies along two lines, neural architecture search and model pruning, achieving the following results:

(1) To address the problems and limitations of differentiable neural architecture search methods, including an inflexible search space, the model-collapse phenomenon, and low search-evaluation correlation, this dissertation analyzes their causes from the two aspects of search space and search strategy, and proposes a neural architecture search framework based on differentiable annealing and dynamic pruning. The framework first improves on the cell-based search space by designing an elastic, densely connected global search space that decouples the depth representation from the weights of the candidate operations, avoiding the aggregation of skip connections. It then adopts a progressive search strategy based on group annealing and threshold pruning: as the search proceeds, the architecture parameters gradually approach a binary distribution and weak operations with low weights are progressively pruned, which stabilizes the search process and reduces search time. To improve the computational-resource adaptability of the searched architecture, a channel pruning method based on dynamic programming is also proposed; by gradually pruning redundant channels during the search, the final architecture strictly satisfies the given resource constraints.
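The progressive strategy in (1), annealing the architecture parameters toward a binary distribution while pruning low-weight operations, can be illustrated with a minimal NumPy sketch. Everything here (function names, the temperature schedule, the 0.05 threshold) is an illustrative assumption rather than the dissertation's implementation, which operates on groups of parameters inside a differentiable search:

```python
import numpy as np

def annealed_arch_weights(alpha, temperature):
    """Softmax over architecture parameters with an annealed temperature.
    As the temperature decays toward zero, the weights sharpen toward a
    binary (one-hot-like) distribution."""
    logits = alpha / temperature
    logits = logits - logits.max()  # numerical stability
    w = np.exp(logits)
    return w / w.sum()

def prune_weak_ops(alpha, op_names, temperature, threshold):
    """Drop candidate operations whose annealed weight falls below a threshold."""
    w = annealed_arch_weights(alpha, temperature)
    keep = w >= threshold
    return alpha[keep], [name for name, k in zip(op_names, keep) if k]

# Toy progressive search loop: the temperature decays each step, the
# distribution sharpens, and weak operations are pruned along the way.
alpha = np.array([0.8, 0.1, 0.5, -0.3])
ops = ["conv3x3", "skip", "conv5x5", "maxpool"]
for temperature in np.linspace(1.0, 0.1, 5):
    alpha, ops = prune_weak_ops(alpha, ops, temperature, threshold=0.05)
print(ops)  # operations surviving the toy search
```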
(2) To tackle the key difficulties of the one-shot neural architecture search framework, such as unstable super-network training, low training efficiency, and the poor performance of the optimal sub-network found in a single-path search space, this dissertation presents a unified one-shot neural architecture search method based on multi-path training. It adopts a two-stage framework that separates super-network training and architecture search into two independent steps, which affords good flexibility and versatility. The dissertation further improves the batch normalization layers of the super network, proposing a mixed batch normalization structure suited to multi-path super-network training that improves the effectiveness and stability of the training stage. To raise training efficiency, a search space shrinkage strategy based on a diversity score is proposed: it gradually shrinks the search space by guiding the super network to eliminate inferior combinations of candidate operations during training, improving both search efficiency and the performance of the searched architectures.

(3) Another key problem of the two-stage one-shot framework is that, under the weight-sharing strategy, the true performance ranking of sub-networks is often not well preserved, which misleads performance evaluation in the search stage and makes it difficult to find high-accuracy sub-networks. This dissertation first analyzes the super-network training stage and attributes this phenomenon to consistency shift during training, comprising feature shift and parameter shift. It then proposes a neural architecture search framework based on consistency loss and temporal ensembling, which constructs two super-networks, a teacher and a student. Introducing a cross-path consistency loss greatly improves the generalization ability of the super network and mitigates feature shift; in addition, the teacher's weights are updated as an exponential moving average of the student's, integrating historical weight information and thereby reducing parameter shift during super-network training.

(4) To overcome the problems and limitations of traditional network pruning methods, including the high time cost and poor scalability of hand-designed importance metrics and the high computational complexity and low efficiency of iterative pruning, this dissertation combines the idea of neural architecture search and proposes an automatic pruning framework based on weight sharing and evolutionary algorithms. It trains a slimmable super-network through channel grouping and random channel sampling, so that sub-networks with different channel configurations can directly inherit the corresponding weights and sub-networks of different widths can be evaluated without retraining; this avoids iterative pruning and per-channel importance evaluation and improves pruning efficiency. On this basis, evolutionary search realizes automatic pruning, enabling the pruned network to strike an immediate balance between accuracy and resource consumption (e.g., parameters, FLOPs, latency).
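Two of the mechanisms above are simple enough to sketch. The temporal-ensembling update in (3) maintains the teacher super-network as an exponential moving average of the student's weights; the PyTorch-style sketch below is a minimal illustration, assuming the two networks share a parameter layout (the decay value is an assumption, not the dissertation's setting):

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Teacher weights track an exponential moving average of the student's,
    integrating historical weight information across training steps."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)
```

Calling ema_update(teacher, student) once per training step with a decay close to 1 makes the teacher a temporal ensemble of recent student states. The evolutionary automatic pruning in (4) can likewise be sketched as a search over per-layer channel configurations under a resource budget. All names and hyperparameters below are illustrative; evaluate(cfg) stands for scoring a sub-network with weights inherited from the slimmable super-network, and flops_of(cfg) for the resource predictor:

```python
import random

def evolve_widths(evaluate, flops_of, num_layers, choices, budget,
                  pop_size=20, generations=10, mutate_p=0.2):
    """Evolutionary search over per-layer channel configurations.
    evaluate(cfg) scores a sub-network that inherits its weights from the
    shared super-network, so no candidate is retrained."""
    def random_cfg():
        return tuple(random.choice(choices) for _ in range(num_layers))

    def mutate(cfg):
        return tuple(random.choice(choices) if random.random() < mutate_p else c
                     for c in cfg)

    # Seed the population with configurations that respect the budget.
    pop = [cfg for cfg in (random_cfg() for _ in range(pop_size * 10))
           if flops_of(cfg) <= budget][:pop_size]
    for _ in range(generations):
        parents = sorted(pop, key=evaluate, reverse=True)[:max(2, pop_size // 2)]
        children = [mutate(random.choice(parents)) for _ in range(pop_size)]
        pop = parents + [cfg for cfg in children if flops_of(cfg) <= budget]
    return max(pop, key=evaluate)

# Toy usage: pretend accuracy grows with total width, FLOPs quadratically.
acc = lambda cfg: sum(cfg) + random.random()
flops = lambda cfg: sum(c * c for c in cfg)
best = evolve_widths(acc, flops, num_layers=4, choices=(16, 32, 64), budget=8000)
```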
(5) Building on the neural architecture search and model pruning methods above, this dissertation further extends them to image recognition tasks in real-world scenes. Specifically, it studies how to automatically design backbone architectures for remote sensing image recognition and explores how neural architecture optimization can make an impact in practical application scenarios. To break through the performance bottleneck of backbones designed for natural images when applied to remote sensing imagery, and to overcome the difficulty of applying neural architecture search to this task, a new design paradigm for remote sensing backbone architectures is proposed: a one-shot neural architecture search framework based on weight sharing, comprising three stages of super-network pre-training, fine-tuning, and backbone architecture search. The framework also integrates the super-network training strategies presented above, which further improves the performance of the searched backbone. In addition, a large-scale remote sensing image dataset is constructed by merging multiple public datasets, which alleviates the shortage of pre-training data and improves the generalization ability of the super network on remote sensing data.
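As a final illustration, the dataset construction in (5), merging multiple public remote sensing datasets into one large pre-training corpus, might look like the following sketch. The abstract does not say how the label spaces are combined, so the offset-based relabeling below is a hedged assumption:

```python
from torch.utils.data import ConcatDataset, Dataset

class RelabeledDataset(Dataset):
    """Wrap a source dataset and shift its labels into a shared label space.
    Illustrative only: a real merge would also reconcile duplicate classes."""
    def __init__(self, base, label_offset):
        self.base = base
        self.label_offset = label_offset

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        image, label = self.base[idx]
        return image, label + self.label_offset

def merge_for_pretraining(datasets, classes_per_dataset):
    """Concatenate several public datasets into one pre-training corpus."""
    parts, offset = [], 0
    for ds, num_classes in zip(datasets, classes_per_dataset):
        parts.append(RelabeledDataset(ds, offset))
        offset += num_classes
    return ConcatDataset(parts), offset  # merged dataset, total class count
```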
Keywords/Search Tags:Deep learning, Deep convolutional neural network, Automatic machine learning, Neural architecture search, Model pruning, Remote sensing image recognition, Evolutionary algorithm