The broad learning system is a shallow network with a clear structure, composed of an input layer, a hidden layer, and an output layer. The broad network model trains fast and expands well horizontally, and a variety of incremental algorithms can quickly update a broad learning model so that it adapts to many scenarios. However, the broad learning network still has shortcomings on complex problems, and its performance in practical applications needs improvement. Ensemble broad learning can alleviate the overfitting and instability of the broad network and improve the generalization of the ensemble model, and diversity among the member models is the key to the success of ensemble learning. This thesis therefore studies ensemble broad learning from the perspective of diversity enhancement and proposes improvements for the problems faced by the broad network and by ensemble broad learning methods. The main research contents are as follows:

(1) A single broad network has poor robustness and generalization on complex, unstable, high-dimensional data. To address this problem, an ensemble broad learning method based on input-attribute perturbation is proposed from the perspective of data diversity. Building on the broad learning network, the method introduces a random discarding technique that temporarily drops random weights with a certain probability during training. Perturbing the input attributes with different weights gives the sub-networks different observation perspectives on the data samples and reduces the risk of overfitting; the resulting sub-networks have complementary capabilities, which improves the performance and generalization of the ensemble model. Finally, public data sets such as default and zoo are selected to verify the performance of the model.

(2) Training multiple networks independently makes the training time too long and brings a large computing cost to the ensemble process. To address this problem, an ensemble broad learning method based on a similarity measure is proposed from the perspective of parameter diversity. The method exploits the unique transverse structure of the broad network, combining the feature layer with different groups of enhancement nodes to quickly obtain sub-models with different hidden layers. A similarity-measure regularization term is used to construct a new objective function that constrains the output weights, perturbs the model parameters, and promotes parameter diversity; optimizing this objective generates models with low mutual similarity one after another. The method not only increases the diversity among models but also reduces training time and saves computing cost, so an effective ensemble broad model can be established quickly. Finally, public data sets such as biodeg and seed are selected to verify the performance of the model.
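A minimal sketch of the idea behind contribution (2), assuming a negative squared-distance penalty as the similarity measure and illustrative hyper-parameters (the thesis's exact similarity regularizer may differ). Each sub-model's hidden output matrix A stacks the shared feature nodes with one group of enhancement nodes, and its output weights are pushed away from those of previously trained sub-models while still fitting the targets:

```python
import numpy as np

def diverse_output_weights(A, Y, prev_Ws, lam=1e-2, gamma=1e-3):
    """Solve a similarity-penalized ridge objective for one sub-model:

        min_W ||A @ W - Y||^2 + lam * ||W||^2 - gamma * sum_j ||W - W_j||^2

    The negative squared-distance term pushes the new output weights W away
    from the weights W_j of previously trained sub-models, which promotes
    parameter diversity. Setting the gradient to zero gives the closed form
    below; gamma must be small enough that the left-hand side stays positive
    definite. (Hypothetical penalty and hyper-parameters, not the thesis's.)
    """
    m = len(prev_Ws)
    lhs = A.T @ A + (lam - gamma * m) * np.eye(A.shape[1])
    rhs = A.T @ Y - gamma * sum(prev_Ws)
    return np.linalg.solve(lhs, rhs)

# Sub-models share the feature nodes Z and differ only in their enhancement
# group H_k, so A_k = np.hstack([Z, H_k]) and the weights above are solved
# once per group -- no full retraining of the network is needed.
```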
(3) The parameters of multiple independently trained networks must all be stored, so memory consumption is unavoidable; when the parameters are numerous or the hardware capacity is limited, this restricts the application and promotion of the model. To address this problem, a structural ensemble broad learning method based on scaling vectors is proposed from the perspective of structural diversity, reducing the memory footprint of the model while maintaining the good performance of the ensemble broad network. In this method, a cosine-distance regularization constraint is introduced into the objective function to train the scaling vectors, which are used to find hidden sub-network paths within a single network and thereby extract different network structures for the ensemble. Because only the scaling vectors need to be saved, the memory constraint is alleviated. Introducing diversity from the perspective of network structure reduces the synergy between nodes and the risk of overfitting, improving model performance while cutting the memory consumption of the ensemble broad model. Finally, public data sets such as hepatis and hung are selected to verify the performance of the model.
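A minimal sketch of contribution (3)'s mechanism, under the assumption that each scaling vector is learned by plain gradient descent on the fit error of the gated network plus a cosine-similarity penalty against previously learned vectors (the optimizer, thresholding rule, and hyper-parameters are illustrative, not the thesis's):

```python
import numpy as np

def cos_sim_grad(s, v):
    """Gradient of the cosine similarity cos(s, v) with respect to s."""
    ns = np.linalg.norm(s) + 1e-12
    nv = np.linalg.norm(v) + 1e-12
    return v / (ns * nv) - (s @ v) * s / (ns**3 * nv)

def train_scaling_vector(A, Y, W, prev_s, steps=300, lr=0.05, gamma=0.5, rng=None):
    """Learn one scaling vector s that gates the hidden nodes of a shared,
    already-trained broad network (A: hidden-node outputs, W: output
    weights). The loss combines the fit error of the gated network with a
    cosine-similarity penalty against previously learned vectors, so each
    vector selects a structurally different sub-network.
    """
    rng = rng or np.random.default_rng()
    s = rng.random(A.shape[1])                    # one gate per hidden node
    n = len(A)
    for _ in range(steps):
        E = (A * s) @ W - Y                       # residual of gated network
        grad = 2.0 / n * ((A.T @ E) * W).sum(axis=1)      # d(mse)/ds
        grad += gamma * sum(cos_sim_grad(s, v) for v in prev_s)
        s = np.clip(s - lr * grad, 0.0, 1.0)      # keep the gates in [0, 1]
    return s

def extract_path(s, tau=0.5):
    """Threshold a scaling vector into a binary hidden path. Only these
    vectors (plus one shared network) need to be stored, which is where
    the memory saving of the structural ensemble comes from."""
    return s > tau
```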