Research On Object Detection Method Based On Deep Learning And Relational Reasoning

Posted on:2024-04-06

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Li

GTID:2568307157971429

Subject:Electronic information technology

Abstract/Summary:

As technology continues to develop, computer vision has gradually entered everyday life. Object detection, an active research direction, is widely used in face detection, autonomous driving, industrial production, aerospace, and other fields. An object detector classifies the objects contained in an image and gives a bounding box for each. Many other computer vision tasks depend on the output of object detection algorithms, so improving their accuracy is important. However, existing object detection algorithms usually process each candidate region separately and lack the ability to reason about the relationships between objects. Objects in images usually carry rich relational information, and ignoring it hurts both accuracy and efficiency, which makes such methods inherently limited. Therefore, inspired by the human process of recognition and inference, this thesis proposes two methods to improve detection accuracy: a multimodal relational inference object detection method based on Transformer and graph convolutional networks, and a relational fusion object detection method based on an attention mechanism and similarity matching. The main research contents are as follows:

(1) Objects to be detected are often strongly related to each other and are not cleanly separated from their context. To address this, this thesis proposes a multimodal relational inference object detection method based on Transformer and graph convolutional networks. First, the method introduces text-modality relations obtained from image description (captioning) algorithms and models the relationships between objects, so that the implicit associative information in images assists the detection task. Second, the visual-modality features are enhanced and enriched by drawing on an NLP model based on a multi-head attention mechanism. Finally, the features with enhanced object information are passed to the classification and regression sub-networks for training. In this way, the detector can correct objects that were originally misdetected or missed, and can also accurately recognize some small objects.

(2) The multimodal relational inference network does not incorporate human-like recognition and reasoning. To address this, this thesis builds on the above method and proposes a relational fusion object detection method based on an attention mechanism and similarity matching. First, contextual prior information is obtained by constructing a knowledge graph, which simulates how the human brain stores experience and knowledge. Second, because the knowledge graph is relatively large, it is optimized by removing redundant edges. Finally, a similarity matching module between the knowledge graph and the regions of interest is introduced to emulate human visual inference and further enhance the feature representation, which is then passed to the detection head network for training. In this way, the prior knowledge acts as supervision and bias correction, improving the performance of the detector.

The proposed methods are evaluated through comparative experiments and analysis on the MS COCO and PASCAL VOC datasets. The experiments show that both methods improve detection accuracy: compared with the Faster R-CNN network, accuracy improves by 0.7% and 1.2%, respectively.
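The knowledge-graph steps described in part (2) can be illustrated with a small sketch. The abstract does not say how the graph is built, how edges are pruned, or how similarity matching is computed, so everything below is an assumption chosen for illustration: edges are class co-occurrence frequencies taken from image-level labels, pruning is a fixed weight threshold, and matching is a softmax over cosine similarities between a region-of-interest feature and per-class embeddings. The function names (`build_cooccurrence_graph`, `prune_edges`, `enhance_roi`) are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(image_labels):
    """Build a directed class graph from per-image label lists.

    Assumption: the edge weight (a, b) is the fraction of images
    containing class a that also contain class b.
    """
    pair_counts = Counter()
    class_counts = Counter()
    for labels in image_labels:
        unique = sorted(set(labels))
        class_counts.update(unique)
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
            pair_counts[(b, a)] += 1
    return {(a, b): n / class_counts[a] for (a, b), n in pair_counts.items()}


def prune_edges(graph, threshold):
    """Drop weak edges (assumed pruning rule: keep weight >= threshold)."""
    return {edge: w for edge, w in graph.items() if w >= threshold}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def enhance_roi(roi_feature, class_embeddings):
    """Add a similarity-weighted mix of class embeddings to an RoI feature.

    Assumption: class_embeddings maps class name -> vector with the same
    length as roi_feature; in a real system these would come from the
    knowledge graph nodes.
    """
    names = list(class_embeddings)
    weights = softmax([cosine(roi_feature, class_embeddings[n]) for n in names])
    prior = [
        sum(w * class_embeddings[n][i] for w, n in zip(weights, names))
        for i in range(len(roi_feature))
    ]
    return [x + p for x, p in zip(roi_feature, prior)]
```

In this sketch the pruned graph and the matching step are kept separate; how the thesis couples them (for example, by propagating node features along the remaining edges) is not stated in the abstract.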

Keywords/Search Tags:

object detection, relational reasoning, multimodal relations, graph convolutional networks, attention mechanism, relational fusion

Related items

1	Research On Audio-Visual Emotion Recognition Based On Relational Reasoning And Attention Mechanism
2	Research And Implementation Of Scene Graph Generation Algorithm Based On Attention Mechanism
3	Research On Knowledge Graph Link Prediction Method Based On Graph Convolutional Neural Networ
4	Research On Situational Reasoning Question Answer Method Based On Deep Learning
5	Algorithm Optimization And System Implementation Of Knowledge Graph Relational Reasoning Based On Deep Learning
6	Research On Visual Question Answering Based On Deep Neural Network
7	Multimodal Dialog System:Relational Graph-based Context-aware Question Understanding
8	Research On Robust Person Re-Identification Algorithm Based On Graph Representation
9	A Research Of Relational Inference Algorithm Based On Knowledge Graph
10	Probabilistic reasoning utilizing relational database techniques