
Local Rademacher Complexity Based Regularization Algorithms

Posted on: 2024-06-15    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Z Q Lu    Full Text: PDF
GTID: 1528307292960519    Subject: Computer Science and Technology
Abstract/Summary:
Neural networks have recently shown impressive performance in sophisticated real-world tasks, including image classification, object recognition, and image captioning. With the development of deep learning, the depth and width of deep learning models have grown substantially, so the number of parameters in a neural network has increased exponentially. Given limited training data, however, such networks overfit easily: an overfitted model performs well on the training data but poorly on new inputs. The major strategy for preventing overfitting is regularization, yet most regularization algorithms are designed heuristically, so they do not generalize well across settings. Since weak predictive power on new inputs is the hallmark of poor generalization, analyzing a model's generalization can provide theoretical guidance for designing regularization algorithms. Building on solid generalization analyses, several well-performing regularization algorithms have been amended and complemented, which also enhances model interpretability. If a solid theoretical framework is established, neural network models can overcome their black-box nature and weak robustness to achieve better performance. This thesis therefore begins with generalization analyses that derive, through rigorous mathematics, upper bounds on the complexity of various network architectures. New regularization functions and optimization strategies are then formed from the derived upper bounds on the local Rademacher complexity of the hypothesis set. Going from theory to algorithm design in this way reverses the usual approach of first heuristically designing a regularization algorithm and then seeking a theoretical explanation.

First, this thesis analyzes the local Rademacher complexity of the hypothesis set in the most basic architecture, fully-connected networks. Using probability concentration inequalities and properties of vector spaces, an upper bound on the local Rademacher complexity is derived for fully-connected networks, including those with Dropout. A new regularization function is distilled from the derived upper bound, and a corresponding two-stage optimization strategy is proposed. Experiments on two image classification datasets demonstrate the effectiveness of the proposed algorithm.

Since convolutional neural networks are better suited to image classification, this thesis then extends complexity regularization to convolutional neural networks, including DropBlock, the Dropout variant better suited to them. The analysis starts with the relatively simpler single-channel case: after abstracting DropBlock and the convolutional network into a mathematical model, an upper bound on the local Rademacher complexity of the hypothesis set, together with a new regularization function, is derived through the convolution inequality. In the more complex and more common multi-channel case, the three-dimensional feature maps and four-dimensional kernels are converted into two-dimensional matrices through block matrices, and the upper bound and regularization function are then obtained by the same procedure as in the single-channel case. The proposed algorithm achieves lower classification error on a larger image classification dataset.

Lastly, inspired by the representations and properties of the weight matrices in the two previous studies, and because multitask models have diverse learning mechanisms and network structures, the analyzed multitask model is structure-independent. The proposed multitask model also contains random masks that prevent gradient conflicts between subtasks. An upper bound on the local Rademacher complexity of the hypothesis set is derived through probability concentration inequalities and properties of norms, and is then simplified into a new regularization function. Compared with other regularization algorithms, the proposed algorithm achieves better performance on multitask classification tasks.
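The two-stage optimization strategy described for the fully-connected case can be sketched as follows. This is a minimal illustration on a toy linear model, assuming the distilled regularization function reduces to a squared-norm penalty on the weights; the data, penalty coefficient, and stage schedule are illustrative assumptions, not the thesis's exact construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linearly separable classification with few samples,
# the overfitting-prone regime the abstract describes.
X = rng.normal(size=(20, 5))
true_w = rng.normal(size=5)
y = np.sign(X @ true_w)

def loss_and_grad(w, lam):
    """Logistic loss plus a squared-norm surrogate for the complexity bound.

    The L2 penalty stands in for the regularization function distilled
    from the local Rademacher complexity upper bound (an illustrative
    assumption, not the thesis's exact functional form).
    """
    z = y * (X @ w)
    loss = np.mean(np.log1p(np.exp(-z))) + lam * np.dot(w, w)
    grad = -(X.T @ (y / (1.0 + np.exp(z)))) / len(y) + 2.0 * lam * w
    return loss, grad

def train(stages, lr=0.1):
    """Two-stage optimization: each stage is (num_steps, lambda)."""
    w = np.zeros(5)
    for steps, lam in stages:
        for _ in range(steps):
            _, g = loss_and_grad(w, lam)
            w -= lr * g
    return w

# Stage 1: fit the empirical loss (lambda = 0).
# Stage 2: continue with the complexity penalty switched on.
w = train([(200, 0.0), (200, 0.05)])
train_acc = np.mean(np.sign(X @ w) == y)
```

The point of the second stage is to shrink the hypothesis set's effective complexity after an initial unconstrained fit, rather than to change the loss itself.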
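The block-matrix conversion used in the multi-channel analysis is essentially an im2col-style flattening: the (C_out, C_in, kh, kw) kernel tensor becomes a 2-D matrix, and the (C, H, W) feature map becomes a 2-D matrix whose columns are receptive fields, so the convolution reduces to one matrix product. A sketch, with all dimensions and function names assumed for illustration:

```python
import numpy as np

def im2col(x, kh, kw):
    """Unfold a (C, H, W) feature map into a 2-D matrix whose columns
    are the flattened receptive fields (valid padding, stride 1)."""
    c, h, w = x.shape
    oh, ow = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            cols[:, i * ow + j] = x[:, i:i + kh, j:j + kw].ravel()
    return cols

def conv2d_as_matmul(x, kernels):
    """Multi-channel convolution as a single matrix product:
    the 4-D kernel tensor and 3-D feature map both become 2-D."""
    co, ci, kh, kw = kernels.shape
    K = kernels.reshape(co, ci * kh * kw)   # 4-D kernels -> 2-D block matrix
    cols = im2col(x, kh, kw)                # 3-D feature map -> 2-D
    oh, ow = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    return (K @ cols).reshape(co, oh, ow)

rng = np.random.default_rng(1)
x = rng.normal(size=(2, 5, 5))      # C_in = 2 feature map
k = rng.normal(size=(3, 2, 3, 3))   # C_out = 3, C_in = 2, 3x3 kernels
out = conv2d_as_matmul(x, k)        # shape (3, 3, 3)
```

Once the convolution is written this way, norm-based bounds derived for matrix products in the single-channel case carry over directly, which is the reduction the abstract relies on.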
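The random-mask mechanism for avoiding gradient conflicts between subtasks can be sketched as follows. The scheme shown here, one fixed binary mask per task over a shared parameter vector so that different tasks mostly update disjoint coordinates, is an illustrative assumption, not the thesis's exact design:

```python
import numpy as np

rng = np.random.default_rng(2)

def make_task_masks(num_tasks, num_params, keep_prob=0.5):
    """One random binary mask per task over the shared parameters.

    Each task only updates its masked coordinates, reducing the chance
    that two tasks' gradients pull the same weight in opposite
    directions (illustrative mechanism, details assumed)."""
    return (rng.random((num_tasks, num_params)) < keep_prob).astype(float)

def masked_update(w, grads, masks, lr=0.1):
    """Apply each task's gradient only through its own mask."""
    for g, m in zip(grads, masks):
        w = w - lr * (m * g)
    return w
```

Because coordinates where a task's mask is zero are untouched by that task's gradient, conflicting updates can only occur where masks overlap, and lowering `keep_prob` shrinks that overlap.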
Keywords/Search Tags:Neural Networks, Generalization, Regularization, Rademacher Complexity, Multitask Learning