Deep Learning-Based Few-Shot Detection And Segmentation For Prohibited Items

Posted on: 2024-04-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K J Liu
GTID: 1521307301458824
Subject: Computer application technology
Abstract/Summary:
Safety in crowded public places such as subway stations and airports has received widespread attention, and scanning luggage and packages with X-ray security machines has become a widely used method of public safety inspection. Prohibited items in the resulting X-ray images are still identified mainly by hand. The effectiveness and quality of manual identification depend on the ability of the security personnel, who must watch the screen continuously; the work is labor-intensive and fatiguing, which leads to missed and false detections. Automatically detecting and identifying prohibited items in luggage or packages with artificial intelligence is therefore an urgent need.

Early research on automatic identification used machine learning algorithms built on hand-crafted features and classifiers, but these methods have low recognition accuracy. In recent years, the rapid development of deep learning has driven progress in computer vision. However, deep learning methods need a sufficient number of training samples to generalize well, and X-ray images containing prohibited items are extremely scarce, which limits the application of deep learning to prohibited item recognition. Reducing the dependence of deep learning models on large numbers of samples has therefore become particularly important. Few-shot learning methods aim to train deep learning models with a small number of samples and to predict samples of unknown classes; the labeled samples are called support samples and the samples to be recognized are called query samples.

Current research on few-shot identification of prohibited items is insufficient in three respects. First, because X-rays penetrate objects, objects in X-ray images retain their shape information well even when occluded, yet few-shot inspection methods have not effectively applied edge information to the inspection task. Second, few-shot learning methods usually use support samples to predict query samples, so their performance depends on the support samples, and variation in support sample quality can cause model instability and performance degradation. Third, few-shot learning methods are mostly fully supervised and are less robust to prohibited item categories that did not appear during training.

To solve these problems, this dissertation exploits the penetrability of X-rays, enforces prototype consistency in the few-shot learning model, enhances image details, and improves the generalization of the few-shot learning model, combining few-shot learning algorithms with object detection and object segmentation for prohibited items inspection in X-ray images. Object detection aims to output the positions of all targets in the image and the category of each. Object segmentation classifies each pixel in the image and extracts the pixels of the categories of interest. Detection therefore provides the predicted location and category of prohibited items, while segmentation makes finer, pixel-by-pixel predictions and outputs high-quality segmentation results. The main contents and novel contributions are as follows:

1. To fully utilize the shape information that X-ray penetrability preserves for occluded objects and to increase the stability of the few-shot learning model, a few-shot detection method (RVViT) for prohibited items inspection based on a multi-scale edge monitoring network and prototype consistency is presented. RVViT adopts a Transformer encoder to generate high-level semantic features that contain global information. Because objects in X-ray images retain their shape information well even when severely occluded, this dissertation also designs a new Multi-Scale Edge Monitoring Network (MSEMN) that extracts and enhances finer details and integrates them into the global information to obtain enhanced image features. To further improve the stability of the few-shot detection model and ensure prototype consistency between the support sample and the query sample, RVViT uses a Reverse Validation Strategy (RVS) to assist training. In the 1-/5-shot settings, RVViT achieved mAP values of 61.2% and 68.5% respectively on the Xray-PI dataset, the best performance among methods of the same period.

2. To improve the ability of semantic segmentation models to capture both detailed and global information, a few-shot segmentation method (EFANet) for prohibited items inspection based on a differential dual branch network and parallel spatial attention is proposed. Current segmentation methods often output high-level semantic information directly to speed up inference. However, both low-level details and mid-level semantics are essential for segmentation, and the lack of low- and mid-level features can cause a considerable decrease in accuracy. Furthermore, such a structure fails to capture global contextual prior attention information, so the extracted features lack contextual semantic information. To address these issues, a Differential Dual Branch Network (DDBN) is designed, which uses a Detail-Aware Module (DAM) to capture low-level information in X-ray images. In addition, to keep the more critical information and suppress unnecessary features when fusing high-level and detailed features, a Parallel Spatial Attention Module (PSAM) is introduced to fuse the detailed and high-level features of each stage and to enhance the semantic information of each pixel by generating global features. In the 1-/5-shot settings, EFANet achieved mean IoU values of 61.7% and 70.2% respectively on the Xray-PI dataset, the best performance among methods of the same period.

3. To address the weak robustness of current fully supervised methods to prohibited item categories that did not appear during training, a self-supervised few-shot segmentation method (PPNet) for prohibited items inspection based on cross-image block relationship modeling is presented. Few-shot object segmentation aims to learn from limited samples and assign a class label to each image pixel. PPNet uses a self-supervised embedded network for cross-image block relational modeling as the base learner, which predicts, in order, the feature vectors of "future" image patches on unlabeled examples. A model trained through this self-supervised embedded network can capture the underlying latent generative factors of the target objects. Because the quality of support samples may greatly interfere with segmentation results and make the few-shot segmentation model unstable, PPNet also uses the Reverse Validation Strategy (RVS) to reduce the impact of support sample quality, further improving segmentation performance. In the 1-/5-shot settings, PPNet achieved mean IoU values of 64.5% and 72.7% respectively on the Xray-PI dataset. Compared with EFANet, this is an improvement of 2.8 and 2.5 percentage points respectively, and the best performance among methods of the same period.

4. To address the appearance gap between support and query samples in few-shot segmentation, a self-supervised few-shot segmentation method (SPNet) for prohibited items inspection based on a self-adaptive prototype module is presented. Because of the appearance gap between the support sample and the query sample, only some regions of the query sample may match the support sample when query pixels are matched against the support prototype. Since pixels belonging to the same object are more similar to each other than to pixels of different objects, this dissertation proposes a self-adaptive prototype module that uses query information as an auxiliary means of self-matching. In addition, SPNet also uses the Reverse Validation Strategy (RVS) to assist training, further improving segmentation performance. In the 1-/5-shot settings, SPNet achieved mean IoU values of 66.2% and 74.3% respectively on the Xray-PI dataset. Compared with EFANet and PPNet, this is an improvement of 4.5 and 1.7 percentage points (1-shot) and 4.1 and 1.6 percentage points (5-shot) respectively, and the best performance among methods of the same period.
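
The support/query prototype matching that the segmentation methods above build on can be illustrated with a minimal, generic sketch. It is not the implementation of RVViT, EFANet, PPNet, or SPNet: it assumes the common baseline of masked average pooling over support features followed by cosine-similarity matching of query pixels, and every function name, the 0.5 threshold, and the toy 2x2 feature maps are hypothetical choices made only for illustration.

```python
import math

def masked_average_prototype(features, mask):
    """Average the feature vectors of the support pixels whose mask value is 1."""
    dim = len(features[0][0])
    total = [0.0] * dim
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, m in zip(feat_row, mask_row):
            if m:
                count += 1
                for k in range(dim):
                    total[k] += vec[k]
    if count == 0:
        raise ValueError("support mask has no foreground pixels")
    return [t / count for t in total]

def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def segment_query(query_features, prototype, threshold=0.5):
    """Label a query pixel 1 when its feature is similar enough to the prototype."""
    return [[1 if cosine(vec, prototype) > threshold else 0 for vec in row]
            for row in query_features]

def iou(pred, target):
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return inter / union if union else 1.0

# Toy 2x2 feature maps with 2-dimensional features (made-up values).
support_features = [[[1.0, 0.0], [0.9, 0.1]],
                    [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1],
                [0, 0]]
query_features = [[[0.8, 0.2], [0.0, 1.0]],
                  [[1.0, 0.1], [0.2, 0.9]]]
query_target = [[1, 0],
                [1, 1]]

prototype = masked_average_prototype(support_features, support_mask)
pred = segment_query(query_features, prototype)   # [[1, 0], [1, 0]]
score = iou(pred, query_target)                   # 2 matching pixels out of 3
print(prototype, pred, score)
```

In this toy case the bottom-right query pixel belongs to the target but does not resemble the support prototype, so it is missed. That is the kind of appearance gap between support and query samples that the abstract describes as motivation for the self-adaptive prototype module.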
Keywords/Search Tags: X-ray Images, Prohibited Items Inspection, Few-Shot Learning, Edge Detection, Self-Supervised Learning, Attention Mechanism