| Saliency detection aims to simulate the visual attention mechanism of the human eye by marking the semantically interesting target regions in a scene. It can serve as a preprocessing step for more complex computer vision tasks, allocating limited computing resources to the important targets in the scene. Co-saliency detection across images must find foreground objects with similar appearance or the same semantics in a group of images. Both tasks are now widely used in object tracking, quality assessment, co-object segmentation, image matching, and other fields. Saliency detection is challenging because the salient objects to be predicted usually differ greatly in shape, size, and location. Especially against cluttered backgrounds, most saliency detection methods still suffer from low prediction accuracy, incomplete object regions, and blurred boundaries. To address these problems, this thesis starts from the receptive field expansion mechanism and the attention mechanism, and proposes a single-image saliency detection algorithm and a cross-image co-saliency detection algorithm. The main work is as follows: (1) This thesis designs a multi-task learning framework that jointly optimizes salient object detection and salient edge detection. This thesis first proposes a hierarchical coarse-to-fine solution to the problem of blurred boundaries. The dual-stream information optimization module (DSIO) cross-optimizes salient object features and edge features during top-down feature decoding: edge features enrich the contour details of the salient object features, and object features suppress background interference in the edge features. Then, to address wrong predictions and incomplete predicted objects, this thesis designs the group optimization fusion module (GOFM), which enhances the representation of salient object features and edge features by generating a
group of grouped features with gradually increasing receptive fields and fusing them effectively. In addition, this thesis uses an ASPP module to construct global saliency features that further guide top-down feature fusion. Experimental results show that the proposed algorithm achieves a higher F-measure and a lower MAE, and performs well on different datasets. (2) Most existing co-saliency detection models cannot stably and effectively construct cross-image attention relationships. For this reason, this thesis designs a group attention module (GAM) based on the Transformer structure that models the relationships between pixels across a set of images and assigns different co-attention weights to different images. In addition, to ensure that the high-level co-attention is not diluted, the attention retain module (ARM) and the spatial attention module (SAM) are designed to provide co-attention weighting during high-level and low-level feature fusion, respectively. Finally, considering consistency within a group and separability across image groups, an embedding loss is additionally designed to learn a more discriminative high-dimensional feature embedding space that distinguishes real co-salient objects from interfering objects. Experimental results show that this method detects co-salient objects across images more accurately. |
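
The two evaluation metrics named above, F-measure and MAE, can be illustrated with a minimal sketch. The abstract does not state the exact evaluation protocol, so everything here is an assumption: the function names `mean_absolute_error` and `f_measure` are hypothetical, the fixed binarization threshold of 0.5 is chosen only for illustration, and the weight beta squared = 0.3 follows a convention common in the saliency literature rather than anything confirmed by this thesis. Saliency maps are represented as flat lists of values in [0, 1].

```python
def mean_absolute_error(pred, gt):
    """Average absolute difference between a saliency map and its ground truth.

    Both arguments are flat lists of values in [0, 1] of equal length.
    """
    if not pred or len(pred) != len(gt):
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """Weighted harmonic mean of precision and recall after binarizing pred.

    threshold and beta_sq are illustrative assumptions, not values taken
    from the thesis.
    """
    if not pred or len(pred) != len(gt):
        raise ValueError("pred and gt must be non-empty and of equal length")
    predicted = [p >= threshold for p in pred]   # binarized prediction
    actual = [g >= 0.5 for g in gt]              # binarized ground truth
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    if true_pos == 0:
        return 0.0  # precision or recall is zero (or undefined)
    precision = true_pos / sum(predicted)
    recall = true_pos / sum(actual)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    prediction = [0.9, 0.8, 0.2, 0.1]
    ground_truth = [1.0, 1.0, 0.0, 0.0]
    print("MAE:", mean_absolute_error(prediction, ground_truth))
    print("F-measure:", f_measure(prediction, ground_truth))
```

A lower MAE means the continuous saliency map is closer to the ground truth pixel by pixel, while a higher F-measure means the binarized prediction recovers the salient region with better precision and recall.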