
Object Detection Technology Research Based on Weakly Supervised Learning

Posted on: 2020-03-14   Degree: Master   Type: Thesis
Country: China   Candidate: H Y Wang   Full Text: PDF
GTID: 2428330596976037   Subject: Communication and Information System
Abstract/Summary:
With the development of the Internet and information technology, modern society generates an ever-increasing amount of visual information. How to make computers "understand" this visual information and support the development of related industries is a key problem that urgently needs to be solved. As a result, computer vision based on deep learning has gradually become a research hotspot in both academia and industry. Technologies such as face recognition, video surveillance, object detection, Internet image content review, and biometrics are now widely used across industries. Object detection, as a fundamental research topic, is a prerequisite for many advanced vision applications, such as autonomous driving, security monitoring, and medical imaging diagnostics. This thesis addresses the core problem of how to improve the performance of deep-learning-based object detection models, and proposes a new approach to object detection based on multi-scale feature fusion, contextual information, and occlusion awareness. The main contents of the thesis are as follows.

To make effective use of the spatial and contextual information in the network, the thesis proposes a novel object detector based on multi-scale fusion and contextual information. Most current object detection models use a Convolutional Neural Network (CNN) to extract image features. However, making full use of the spatial and contextual information in images remains a challenge. In the thesis, a different feature extraction mechanism is designed for each of the two kinds of information. For spatial information, the thesis fuses feature maps from different levels of a feature pyramid, combining the detailed information of the lower layers with the semantic information of the upper layers. For contextual information, the thesis stacks multi-region feature maps. The resulting feature representation is sent to the detection network for classification and location regression. The proposed model has been tested and compared on the PASCAL VOC and MS COCO datasets. The results show that the proposed model performs better, especially for small object detection. The thesis also proposes a novel analysis method that compares different combinations of the mechanisms, which shows intuitively that the two proposed mechanisms greatly improve the overall performance of the model.

A rich feature extraction mechanism alone is not enough to meet the high demands that real-world occlusions place on an object detector. By analyzing the relationship between convolutional neural networks and highly discriminative features in images, the thesis proposes a novel model based on weakly supervised localization and occlusion awareness. The model generates hard examples during training through weakly supervised learning and is trained end to end. To generate suitable hard examples, the thesis proposes a mask generator that uses weakly supervised learning to localize the regions with a strong response in the feature map and then occludes these regions to obtain hard examples. On the PASCAL VOC and MS COCO datasets, this model achieves superior performance compared with previous object detectors. The experimental results confirm that the proposed framework improves the effectiveness of object detection.
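
The idea of occluding strong-response regions to obtain hard examples might be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the thesis's implementation: the function names `occlusion_mask` and `make_hard_example`, the use of a channel-averaged response map, and the 0.7 threshold are all hypothetical choices made for the example. The abstract does not specify how the mask generator is actually built or trained.

```python
import numpy as np

def occlusion_mask(feature_map, threshold=0.7):
    """Return an (H, W) mask that is 0 where the response is strongest.

    feature_map: array of shape (C, H, W).
    threshold:   assumed cut-off on the min-max normalised response;
                 positions at or above it are occluded.
    """
    # Channel-averaged activation as a simple stand-in for a response map.
    response = feature_map.mean(axis=0)
    lo, hi = response.min(), response.max()
    # Min-max normalise to [0, 1]; the epsilon avoids division by zero
    # when the response is constant (in which case nothing is occluded).
    normalised = (response - lo) / (hi - lo + 1e-8)
    return (normalised < threshold).astype(feature_map.dtype)

def make_hard_example(feature_map, threshold=0.7):
    """Zero out the strongest-response positions in every channel."""
    mask = occlusion_mask(feature_map, threshold)
    return feature_map * mask[None, :, :]

if __name__ == "__main__":
    fm = np.zeros((2, 4, 4), dtype=np.float32)
    fm[:, 1, 2] = 10.0   # a strongly responding location
    fm[:, 0, 0] = 1.0    # a weakly responding location
    hard = make_hard_example(fm)
    print(hard[:, 1, 2], hard[:, 0, 0])
```

In this sketch the occluded feature map would be fed to the detection head in place of the original, so the detector has to rely on less discriminative regions. A fixed threshold is used only for simplicity; the thesis describes a learned mask generator.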
Keywords/Search Tags:Deep Learning, Object Detection, Feature Extraction, Weakly-supervised Learning, Occlusion-aware