Research On Scene Recognition Algorithms Based On Deep Learning

Posted on: 2024-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Shao
Full Text: PDF
GTID: 2568307079473014
Subject: Electronic information
Abstract/Summary:
Scene recognition is a fundamental computer vision task. Convolutional Neural Networks (CNNs) have significantly boosted performance in scene recognition, although it still lags well behind other recognition tasks (e.g., object or image recognition). This may be due to ambiguity between classes: images contain multiple layers of information, such as objects, global layout, background environment, and the spatial relationships among individual objects, and images of different scene categories may share similar objects. High variability within a scene category and high similarity between different scene categories both have a significant impact on the accuracy of scene recognition. In this thesis, we investigate deep-learning-based scene recognition algorithms, focusing on the impact of attention mechanisms and multimodal feature fusion on scene recognition. The main work is as follows:

Firstly, we propose a scene recognition algorithm based on an improved ResNet with a coordinate attention mechanism. It captures not only cross-channel information but also direction-aware and position-sensitive information, which helps the model locate and recognize the objects of interest more accurately. With the joint supervision of softmax loss and center loss, we compensate for the weak feature-clustering ability of cross-entropy loss alone, effectively reducing the distance between features of the same class and enlarging the distance between features of different classes.

Secondly, we investigate the impact of semantic features on scene recognition. Contextual information, in the form of a semantic segmentation, is used to gate the features extracted from RGB images; the semantic representation encodes the collection of scene objects and things and their relative positions. We refine the semantic segmentation network with a dilated convolution module to obtain more accurate semantic segmentation results. Meanwhile, we use a convolutional neural network with several layers as a semantic relation extraction network to obtain semantic features for scene recognition.

Thirdly, we describe an approach to scene recognition based on an end-to-end multimodal CNN. We construct an attention-based feature fusion model that combines image and context information by means of an attention module. Semantic features are used to gate the features of the RGB image branch, which reinforces the learning of relevant context information by shifting the focus of attention towards human-accountable concepts indicative of scene classes. The algorithm focuses the network on objects relevant to recognition, guiding correct recognition decisions and further improving accuracy on easily confused categories.

Finally, we validate the effectiveness of the proposed method on the MIT Indoor 67 and Places365 datasets.
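
The joint supervision of softmax loss and center loss mentioned above can be illustrated with a minimal sketch. It assumes the commonly used formulation in which the total loss is the cross-entropy term plus a weighted half squared distance between each feature and the center of its class. The weight `lam`, the function names, and the toy values are hypothetical, and the class centers are held fixed here rather than learned; the abstract does not specify any of these details.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of one sample from raw class scores (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def joint_loss(features, logits, labels, centers, lam=0.01):
    """Batch mean of softmax loss + lam * center loss.

    lam is a hypothetical weighting factor; the abstract gives no value.
    """
    total = 0.0
    for x, z, y in zip(features, logits, labels):
        total += softmax_cross_entropy(z, y) + lam * center_loss(x, centers[y])
    return total / len(labels)

# Toy batch: two samples, two classes, 2-D features (all values illustrative).
features = [[1.0, 0.0], [0.0, 2.0]]
logits = [[2.0, 0.0], [0.0, 2.0]]
labels = [0, 1]
centers = {0: [1.0, 0.0], 1: [0.0, 0.0]}  # fixed class centers for the sketch

loss = joint_loss(features, logits, labels, centers, lam=0.5)
print(loss)
```

In this toy batch the first sample sits exactly on its class center and contributes no center-loss penalty, while the second sample is penalized for lying far from its center even though both are classified with the same confidence. This is the sense in which the center term pulls features of the same class together.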
Keywords/Search Tags: Scene Recognition, Deep Learning, Attention Mechanism, Feature Fusion