| Accurate and rapid extraction of key contents and objects of interest from complex scenes is significant in applications such as military reconnaissance, urban security surveillance, and national security monitoring. However, the rapid analysis and intelligent understanding of massive visual information place higher demands on algorithm performance. Complex scenes contain cluttered backgrounds, low contrast, multiple categories, and multi-scale objects, and conventional algorithms and machine learning models, limited in their nonlinear representation learning capability, struggle to be effective in them. As a mainstream direction of computer vision and a primary line of research in visual content analysis, visual saliency detection aims to simulate the scene perception ability and active selection mechanism of human visual attention to achieve fast screening and extraction of essential contents and key information in the scene. On this basis, this dissertation conducts a series of studies on visual saliency-based object detection methods for complex scenes from the perspectives of model structure design, supervision strategy optimization, deep feature integration, and weakly supervised learning, using optical remote sensing images, natural images, and multimodal and low-contrast scene images as the research objects. The research content and innovations of this dissertation are summarized as follows: 1. To address cluttered background interference and poor object-background distinction in complex scenes, an optical remote sensing salient object detection method based on contrast-weighted dictionary learning is proposed. Online discriminative dictionary learning is realized by introducing the sample contrast into the dictionary learning process; based on the discriminative dictionary, a novel saliency criterion is designed by combining sparse representation coefficients and reconstruction errors to improve the discriminative ability for salient objects in remote sensing images. This
method solves the problem that salient object detection algorithms based on sparse representation are limited by incorrect background region sampling and sensitivity to non-Gaussian noise. Experiments on the constructed remote sensing dataset and a publicly available natural image dataset show that the proposed method better suppresses background interference and highlights salient objects more completely. 2. To address inaccurate object localization and rough boundaries in high-resolution remote sensing image saliency detection, an optical remote sensing image salient object detection method based on semantic guidance and attention refinement is proposed. The designed semantic-guided decoder achieves coarse but accurate object localization by aggregating high-level features, and the integrated semantic information guides the fusion of low-level features and gradually refines the object edges in a top-down manner; the constructed detection model is trained with supervision in an end-to-end manner and outputs high-quality predicted saliency maps. Comparative experiments against 14 mainstream advanced methods show that the proposed method achieves the best results on the main performance evaluation metrics. An experimental example of remote sensing object detection further demonstrates that the method has potential for application in intelligent ground surveillance. 3. To enhance the detection of salient objects in scenes with illumination changes or cloud interference, a multi-level interactive learning method for detecting salient objects in multi-modal scenes is proposed. The designed cross-modal refinement module integrates multi-modal (depth and infrared image) features; the multi-level fusion module achieves feature refinement and reuse through bottom-up, level-by-level feature integration with feedback feature correction; the cross-modal features of different levels are fused to output the final prediction in a pyramid detection
manner. This method constructs a unified salient object detection architecture for RGB-D and RGB-T data scenes and outperforms 14 mainstream methods on the main metrics. Meanwhile, experimental examples of open-water ship detection using RGB-D and scene glass area detection using RGB-T further demonstrate the generalization capability of the proposed model. 4. By analyzing the intrinsic connection between salient object detection and camouflage analysis tasks, a visual Transformer salient object detection method for camouflage analysis is proposed. The visual Transformer is used to encode and model the global context of the camouflage; the non-local token enhancement module is constructed from non-local operations and graph convolution to enhance the local representation of the Transformer; and the paired tokens are aggregated in the decoder by a layer-by-layer shrinking pyramid structure to mine and accumulate effective details and semantic features. This method addresses the low efficiency of local modeling in Transformer-based methods and the limited feature aggregation in the decoder. Experiments show that the proposed method achieves superior performance over 23 state-of-the-art methods on a typical camouflage analysis dataset. Experiments on remote sensing and natural image salient object detection datasets show that the proposed method has excellent generalization capability. 5. To achieve salient object detection in optical remote sensing images from sparse annotation, a weakly supervised optical remote sensing object detection method based on scribble annotation is proposed. Reliable boundary (pseudo) labels are generated by a classification network with class activation mapping; a boundary-aware module is designed to extract object boundary information from shallow features and input images; and the boundary information is combined with the initial saliency map generated by a dense aggregation strategy to guide saliency
prediction in the decoder network. To address the lack of datasets, a scribble-labeled remote sensing image salient object detection dataset is constructed, and experiments on it verify that the method leads existing algorithms on the main performance evaluation metrics. This method is the first to use weak supervision based on scribble annotation to address the ambiguity of object boundary prediction, breaking away from the usual reliance on full supervision for remote sensing saliency detection. |
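
To make the saliency criterion of the first contribution more concrete, the following is a minimal sketch of scoring a patch from its sparse representation coefficients and reconstruction error over a background dictionary. It is not the dissertation's implementation: the ISTA solver, the function names (`sparse_code`, `saliency_score`), the toy dictionary, and in particular the way the two terms are combined (reconstruction error divided by one plus the L1 norm of the coefficients) are assumptions chosen only for illustration, and the contrast-weighted dictionary learning step itself is omitted.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reconstruct(coeffs, atoms):
    """Return sum_k coeffs[k] * atoms[k] as a plain list."""
    dim = len(atoms[0])
    return [sum(c * d[i] for c, d in zip(coeffs, atoms)) for i in range(dim)]


def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def sparse_code(x, atoms, lam=0.05, n_iter=200):
    """ISTA for min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 (assumed solver)."""
    # The squared Frobenius norm of D upper-bounds the largest eigenvalue
    # of D^T D, so 1 / lipschitz is a safe step size.
    lipschitz = sum(v * v for d in atoms for v in d)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        recon = reconstruct(coeffs, atoms)
        resid = [xi - ri for xi, ri in zip(x, recon)]
        # Gradient step on the quadratic term, then soft-thresholding.
        coeffs = [
            soft_threshold(c + dot(d, resid) / lipschitz, lam / lipschitz)
            for c, d in zip(coeffs, atoms)
        ]
    return coeffs


def saliency_score(x, background_atoms, lam=0.05):
    """Hypothetical criterion: high reconstruction error together with weak
    use of background atoms yields a high saliency score."""
    coeffs = sparse_code(x, background_atoms, lam)
    recon = reconstruct(coeffs, background_atoms)
    error = sum((xi - ri) ** 2 for xi, ri in zip(x, recon))
    l1_norm = sum(abs(c) for c in coeffs)
    return error / (1.0 + l1_norm)


# Toy background dictionary spanning the first two feature dimensions.
background = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
background_patch = [0.9, 0.8, 0.0]  # well explained by the dictionary
salient_patch = [0.0, 0.1, 1.0]     # mostly outside the dictionary's span

print(saliency_score(background_patch, background))
print(saliency_score(salient_patch, background))
```

In this toy setting the background-like patch is reconstructed almost exactly and scores near zero, while the patch lying mostly outside the span of the background atoms scores much higher, which is the behavior a criterion of this kind is meant to produce.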