| With the rapid development of deep learning and convolutional neural networks, saliency detection in images has advanced quickly. Saliency detection aims to identify the most salient objects in a scene by simulating the human visual attention mechanism. However, existing salient region detection methods only detect the pixels that belong to the salient region and produce a dense binary saliency map; they cannot distinguish individual instances within that region. Within the field of saliency detection, salient instance segmentation is regarded as the next-generation, instance-level salient object segmentation task. It analyzes the salient region in finer detail to predict a mask for each salient instance. In contrast to generic instance segmentation, salient instance segmentation operates on salient areas and does not identify the classes of instances. Salient instance segmentation is valuable for autonomous driving systems: it can analyze the key targets in a scene and efficiently filter out distracting information, greatly improving the efficiency of assisted driving. The core of the task is to detect the number and location of instances in the salient region while suppressing irrelevant information. Background instances often interfere with predicting the number of salient instances. At the same time, the boundaries of salient instances frequently overlap with other instances that have similar features, which makes it harder to localize each instance. Moreover, because pixel-level annotation of salient instances is expensive, there are not enough salient instance segmentation labels to train deep networks adequately. Focusing on salient instance segmentation in autonomous driving systems, this dissertation proposes a series of methods that address the key issues described above in order to advance the
development of this novel technology area. The main contributions of the dissertation are summarized as follows: A multitask densely connected neural network is proposed to remove the reliance of salient instance segmentation methods on bounding boxes. Exploiting the class-agnostic property of salient instances, the method predicts the number of salient instances with a densely connected subitizing network and the salient regions with a fully convolutional salient region detection network. The salient instances are then segmented by an adaptive spectral clustering algorithm that operates on deep features. This is the first proposal-free salient instance segmentation method, and it improves the prediction of the number of salient instances by subitizing. On the ILSO dataset, it raises the AP from the 52.58% achieved by S4Net to 57.32%. To address the shortage of fully supervised salient instance labels, a weakly supervised model is proposed to segment salient instances. The model is supervised by a combination of salient regions and bounding boxes taken from existing salient object detection datasets. To locate salient instances more accurately, a global feature refining layer is designed to extend the features from the region of interest (RoI) to the global field of the scene. Moreover, a label updating scheme is embedded in the framework to update the weak labels iteratively. Extensive experiments demonstrate that the method, trained with weak labels, is competitive with existing fully supervised methods, reaching 58.3% AP on the ILSO test set, which is 6.4% higher than the second-ranked RDPNet model. This method alleviates the dependence of salient instance segmentation on fully supervised labels and significantly reduces the cost of manual labeling. To address the lack of a large-scale salient instance segmentation dataset, a new
salient instance segmentation dataset is collected. It is the largest salient instance segmentation dataset available, containing over 10K samples carefully annotated with both object- and instance-level labels. Building on the solutions to count prediction and label scarcity, an efficient one-stage method based on the Transformer architecture is proposed to replace the two-stage “detect-then-segment” strategy used by existing salient instance segmentation methods. The method designs an orientative query in the Transformer architecture that optimizes the initial object query and efficiently aggregates the features of different salient instances, significantly improving convergence speed during training. In addition, the method eliminates the need for hand-designed anchors and post-processing. A cross fusion module is also designed to efficiently fuse the global features of the Transformer encoder with the salient query features of the Transformer decoder, improving the accuracy of the segmentation results. This is the first salient instance segmentation method based on the Transformer architecture. It achieves an AP of 60.9% on the SOC test set, significantly outperforming other salient instance segmentation methods and surpassing RDPNet by about 23%. Together, the proposed models address the problems of predicting the number of salient instances and of insufficient dataset size in salient instance segmentation. Based on the large-scale SIS10K dataset, we propose a one-stage Transformer-based model that improves the performance and efficiency of salient instance segmentation. These contributions provide important technical support for scene recognition in autonomous driving systems. |
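
The proposal-free pipeline in the first contribution (a predicted instance count, a salient region, then clustering of deep features) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not specify the "adaptive" part of the clustering, so plain normalized spectral clustering with a fixed RBF bandwidth stands in for it. The function names `spectral_instances` and `segment_salient_instances`, the parameter `sigma`, and the farthest-point k-means initialization are all assumptions made for illustration. The count `k` is assumed to come from a subitizing network and the mask from a salient region detector, neither of which is modeled here.

```python
import numpy as np


def spectral_instances(features, k, sigma=1.0, n_iter=50):
    """Group N feature vectors (N x C) into k clusters; returns labels in 0..k-1.

    Assumption: a fixed-bandwidth RBF affinity replaces the unspecified
    'adaptive' affinity of the original method.
    """
    # Pairwise squared distances and RBF affinity matrix W.
    sq = np.sum(features ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * features @ features.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))

    # Symmetrically normalized affinity M = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    M = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # The k largest eigenvectors of M span the spectral embedding
    # (equivalently, the k smallest of the normalized Laplacian I - M).
    _, vecs = np.linalg.eigh(M)
    emb = vecs[:, -k:]
    emb = emb / np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)

    # Deterministic farthest-point initialization for k-means (an assumption).
    centers = [emb[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((emb - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(emb[np.argmax(dist)])
    centers = np.stack(centers)

    # Lloyd iterations on the embedded rows.
    for _ in range(n_iter):
        dists = ((emb[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.stack([
            emb[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def segment_salient_instances(saliency_mask, feature_map, k):
    """Split a binary salient region (H x W) into k instances.

    feature_map is H x W x C; k is the predicted number of salient instances.
    Returns an H x W integer map: 0 for background, 1..k for instances.
    """
    ys, xs = np.nonzero(saliency_mask)
    labels = spectral_instances(feature_map[ys, xs], k)
    instance_map = np.zeros(saliency_mask.shape, dtype=int)
    instance_map[ys, xs] = labels + 1
    return instance_map
```

Calling `segment_salient_instances(mask, features, k)` with a boolean mask, a per-pixel feature array, and a count returns an integer instance map. In a real system the features would be deep network activations rather than hand-built vectors, and the dense N x N affinity would need to be sparsified or computed on superpixels to scale to full images.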