Research On Object Detection Method In Complex Indoor Scene

Posted on: 2024-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: J J Chen
Full Text: PDF
GTID: 2568307079452564
Subject: Information and Communication Engineering
Abstract/Summary:
Object detection in complex indoor scenes aims to locate and classify objects in cluttered indoor environments. It is widely used in smart homes, security monitoring, home service robots and other fields, and it underpins higher-level visual tasks such as visual question answering, video captioning and instance segmentation. The task is very challenging because of background interference, overlapping occlusion and large variations in object scale.

Existing indoor object detection methods based on deep learning use convolutional neural networks to extract richer semantic features and achieve higher detection performance than traditional methods based on hand-crafted features. However, the following problems remain: (1) Interference factors such as complex backgrounds, target occlusion and scale change make existing object detection methods less robust in complex indoor scenes. (2) Some data are difficult to obtain and expensive to label, so long-tail distributions are widespread in datasets, and models detect objects with few samples poorly.

To address these problems, this thesis carries out research at three levels: data, model and algorithm.

(1) At the data level, to address overfitting during model training, this thesis studies a sample-adaptive mixed data augmentation method based on classification confidence. The proposed method integrates several types of augmentation and estimates the difficulty of each sample from the output confidence of the recognizer. Easy samples receive stronger transformations to ensure the diversity of the augmented samples, while difficult samples receive only general augmentation to avoid semantic inconsistency between the augmented sample and the original. To verify the effectiveness of the method, a dataset for object detection in complex indoor scenes was constructed, containing 25 categories, 7607 images and 78086 annotation boxes. On this dataset, the proposed method improves the mAP of the baseline model by 8.50%.

(2) At the model level, to address the diverse semantic representations that arise under complex backgrounds, object occlusion and scale change, this thesis studies an object detection model for complex indoor scenes based on a cross-attention mechanism. Specifically, an intra-block channel attention mechanism is proposed to improve feature fusion within the residual blocks of the backbone network, and a cross-layer fusion mechanism is designed to extract the correlations between features of different layers, improving the quality of the feature representation. On the dataset constructed in this thesis, the proposed method improves mAP by 1.64% over the baseline model.

(3) At the algorithm level, to address the long-tail distribution of datasets, this thesis studies a few-shot object detection algorithm for complex indoor scenes. Specifically, a dense feature matching module is proposed to capture the correlation between the query image and the support set effectively. A multi-scale semantic information fusion module is designed to fuse, through an attention mechanism, the feature maps generated by dilated (atrous) convolutions with different dilation rates, enhancing the semantic expressiveness and perception ability of the features. On the dataset constructed in this thesis, the proposed method improves mAP by 9.50% over the baseline model.
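The confidence-driven augmentation idea in item (1) can be illustrated with a minimal sketch. This is not the thesis's implementation: the threshold, the two magnitudes, the two augmentation operations and all function names below are hypothetical choices made for illustration, and the sketch works on a plain grayscale image without updating bounding-box annotations.

```python
import random


def augmentation_strength(confidence, easy_threshold=0.8, weak=0.2, strong=0.8):
    """Map a classifier confidence in [0, 1] to an augmentation magnitude.

    Assumption: a high-confidence sample counts as "easy" and gets the strong
    magnitude; all other samples get the weak one. The numbers are illustrative.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return strong if confidence >= easy_threshold else weak


def jitter_brightness(image, magnitude, rng):
    """Scale every pixel by one random factor drawn from [1 - m, 1 + m]."""
    factor = 1.0 + rng.uniform(-magnitude, magnitude)
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def erase_patch(image, magnitude, rng):
    """Zero out a random rectangle whose sides grow with the magnitude."""
    height, width = len(image), len(image[0])
    patch_h = int(height * magnitude / 2)
    patch_w = int(width * magnitude / 2)
    top = rng.randint(0, height - patch_h)
    left = rng.randint(0, width - patch_w)
    out = [row[:] for row in image]  # copy so the input stays untouched
    for y in range(top, top + patch_h):
        for x in range(left, left + patch_w):
            out[y][x] = 0
    return out


def adaptive_augment(image, confidence, rng=None):
    """Pick one operation at random and apply it at a confidence-dependent strength."""
    if rng is None:
        rng = random.Random()
    magnitude = augmentation_strength(confidence)
    operation = rng.choice([jitter_brightness, erase_patch])
    return operation(image, magnitude, rng)


# Usage: an easy sample (high confidence) and a hard one (low confidence).
image = [[100] * 8 for _ in range(8)]
easy_view = adaptive_augment(image, confidence=0.95, rng=random.Random(0))
hard_view = adaptive_augment(image, confidence=0.30, rng=random.Random(0))
```

A two-level step function is the simplest possible mapping; a continuous mapping from confidence to magnitude would be an equally plausible reading of the abstract.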
Keywords/Search Tags:Indoor complex scene, Object detection, Data augmentation, Attention mechanism, Few-shot learning