Research On Feature Enhancement And Data Augmentation For Object Detection

Posted on: 2022-09-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Wang
GTID: 1528306839979689
Subject: Computer application technology
Abstract:
Object detection aims to classify and locate the objects in an image. A detector must output both the precise category of each object and its location, while real-world conditions change the appearance of objects. These factors make it difficult to improve detection performance and to design robust, stable detectors. The core insight for improving object detection is to build a robust feature extraction module and to enrich the diversity of data samples, so enhancing feature representation and augmenting data diversity are both important. Feature enhancement improves the modeling ability and discriminative information of the detector, allowing it to predict object category and location accurately. Data augmentation improves the generalization ability of the detector, especially for data with complex morphological changes, and thus its stability.

In recent years, with the development of deep learning, object detection methods based on convolutional neural networks (CNNs) and public datasets have been widely studied. However, current detectors still face performance bottlenecks, and feature enhancement and data augmentation need further investigation. On the one hand, detectors have benefited from the move from hand-crafted features to deep networks, and stronger feature representation has improved detection performance. However, simply increasing the depth of the CNN and deploying multi-scale feature fusion and sampling does not take into account the characteristics of object detection features, the way features are learned for each sub-task, or the relationship between detection features and other computer vision tasks. On the other hand, current public datasets cover limited scenes for each category and suffer from data imbalance. Although current data augmentation methods reduce the dependence on the dataset, synthetic datasets lack sample diversity and need an external mechanism to guarantee data quality, which remains a challenge for improving detection performance.

To deal with these issues, this thesis starts from feature enhancement and data augmentation, designing an efficient high-order feature representation method and proposing an effective feature disentanglement pipeline and an instance-switching data augmentation method. Furthermore, we integrate feature enhancement and data augmentation into a single model to improve performance. The main research contents are summarized as follows.

(1) Object detection features need to preserve location information so that bounding box regression can be performed on the proposals. Traditional high-order methods enhance the representation ability of the feature but cannot satisfy this constraint. We propose Multi-Scale Structural Kernel Representation for Object Detection. We first modify existing feature fusion methods and obtain multi-scale features by fusing the features within and between convolutional blocks. This adds only a slight computational cost while improving detection performance. We then apply feature power normalization and a polynomial kernel approximation to obtain a high-order feature representation that considers geometrical structure, preserves position information, and keeps the numerical computation stable. Finally, we add an attention module to decode spatial and channel information from the high-order features. The proposed high-order feature representation improves the ability of the object detection feature and outperforms other methods.

(2) Object detection consists of two sub-tasks, classification and localization. Current feature learning pipelines use shared networks and do not account for the feature differences between the translation-invariant classification sub-task and the translation-variant localization sub-task. We propose Sub-Task Feature Learning for One-Stage Object Detection. Starting from the feature differences between classification and localization, we use two separate backbones to extract task-specific features, and we insert additional classification and localization prediction heads for end-to-end task-specific feature learning. The relationship between the two sets of features is also considered: a feature interaction module combines them to accomplish the detection task. To guarantee the training stability of the network, a cosine annealing learning rate schedule is applied. This feature disentanglement pipeline improves the feature ability and the performance of one-stage detectors.

(3) Cut-paste data augmentation for object detection must place objects at precise locations in the synthetic data, but current methods need an additional mechanism to model object positions. To ensure the context coherence and quality of the synthetic data, we propose Constrained Online Cut-Paste Data Augmentation for Object Detection. During training, we switch objects that come from different training samples but belong to the same class, which keeps the objects coherent with the background images without any external position prediction mechanism. A geometrical consistency constraint accounts for the shape difference and scale similarity of the switched objects. A weighted loss, computed from the difference between each original sample and its corresponding synthetic one, is used during training to constrain sample diversity. Data augmentation by instance switching yields prominent improvements on different detectors.

(4) Current methods do not enhance features and augment data simultaneously in a single model to further improve the robustness and stability of the detector. We propose Automatic Label Assignment by Semantic Feature Distillation for Object Detection. We add a weakly-supervised semantic segmentation module and distill its output semantic information into the label assignment module of the detector. Because semantic information can distinguish the foreground and background of objects, it is used to weight the positive and negative samples in the label assignment module. A center prior weighting based on a Gaussian function and a dynamic strategy for selecting the number of positive samples are also used to improve the quality of the positive samples. During training, data augmentation by instance switching is deployed to improve the diversity of the data. The method improves the performance of both object detection and semantic segmentation.
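
The abstract does not give the form of the Gaussian center prior used in (4). As an illustration only, the following minimal sketch assumes a common formulation in which a candidate point is weighted by an axis-aligned Gaussian centered on the ground-truth box. The function name `center_prior_weight` and the parameter `sigma_scale` are hypothetical and are not taken from the thesis.

```python
import math


def center_prior_weight(px, py, box, sigma_scale=0.5):
    """Return a Gaussian weight in (0, 1] for a point relative to a box center.

    box is (x_min, y_min, x_max, y_max). sigma_scale is a hypothetical
    hyperparameter: the standard deviation along each axis is this
    fraction of the box width or height.
    """
    x_min, y_min, x_max, y_max = box
    sigma_x = sigma_scale * (x_max - x_min)
    sigma_y = sigma_scale * (y_max - y_min)
    if sigma_x <= 0 or sigma_y <= 0:
        raise ValueError("box and sigma_scale must give positive sigmas")

    # Box center, assumed here to be the mean of the Gaussian.
    cx = (x_min + x_max) / 2.0
    cy = (y_min + y_max) / 2.0

    # Offsets from the center, normalized by the per-axis sigma.
    dx = (px - cx) / sigma_x
    dy = (py - cy) / sigma_y
    return math.exp(-0.5 * (dx * dx + dy * dy))
```

Under these assumptions, a point at the box center receives weight 1, and the weight decays smoothly toward the box border, so candidates near the center are favored as positive samples.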
Keywords: Object Detection, Deep Learning, Convolutional Neural Network, Feature Enhancement, Data Augmentation