| Given the importance of maritime territory to the country, accurate and efficient detection of sea-surface targets is of great economic and military significance. Some scholars have studied this task, but most of their work relies on single-source images, and the proposed algorithms adapt poorly to new scenes and resist interference poorly, so they cannot meet the needs of a practical detection system. The main research question of this thesis is therefore how to use multi-source information effectively to accurately recognize and locate ship targets in complex sea-sky scenes. Because a single-band infrared image is easily disturbed by complex scenes, which degrades imaging quality and harms detection performance, this thesis takes dual-band infrared images, medium-wave and long-wave, as input. Based on an analysis of their characteristics and imaging principles, this thesis first designs a pixel-level image fusion algorithm based on an attention mechanism. The algorithm exploits the complementary information in the multi-source images to output a fused image that is more robust to time of day and weather, thereby increasing target saliency and suppressing background clutter. The model combines the advantages of the two existing fusion approaches, and its attention fusion part perceives the image as a whole, which safeguards fusion quality. In addition, two auxiliary training modules are designed to reduce the difficulty of unsupervised training and further improve learning. Taking the multi-band fused images produced by this method as input, this thesis performs ship detection with an independent object detection model. To address the problems that existing neural-network object detection algorithms detect small targets poorly and are easily disturbed by the scene, this thesis designs a multi-task perception model for the sea-sky scene. Besides the ship detection task, it introduces a sea-sky line localization task. Feature-level and task-level information interaction modules in the model link the two tasks, so that the sea-sky line localization result serves as prior information to suppress interference from irrelevant regions and thereby improve ship detection. On this basis, the general detection framework is adjusted, according to the characteristics of the ship targets in our dataset, to improve its adaptability to small-scale targets. With this cascaded workflow, i.e. a fusion model followed by a perception model, the task of this thesis, sea-surface ship detection from medium-wave and long-wave images, can be completed to some extent. However, considering the possible information loss and extra time cost of running multiple models, this thesis also proposes an algorithm that performs multi-source information fusion and target detection simultaneously. After analyzing how well the attention operation in the Transformer fits this task, this thesis designs an end-to-end medium-wave and long-wave sea-surface ship detection algorithm based on that architecture. The model takes the original medium-wave and long-wave images as input, extracts the dual-band information and lets the two bands interact through its internal attention computation, and directly outputs the final detection results. Finally, to address the difficulty of training a Transformer on a small-scale dataset, this thesis designs a targeted self-supervised pretraining method that further improves the detection performance of the end-to-end model. |
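
The idea of using the sea-sky line as prior information can be illustrated with a simplified post-processing sketch. This is not the thesis's method, which couples the two tasks inside the network through feature-level and task-level interaction modules. The sketch only shows the underlying intuition: candidate boxes lying entirely in the sky region are unlikely to be ships and can be discarded. The `Detection` type, the linear line model `y = slope * x + intercept`, and the `margin` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """Hypothetical axis-aligned box in pixel coordinates (y grows downward)."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def filter_by_sea_sky_line(
    detections: List[Detection],
    slope: float,
    intercept: float,
    margin: float = 10.0,
) -> List[Detection]:
    """Keep boxes whose bottom edge reaches the sea-sky line (within `margin` pixels).

    The sea-sky line is assumed to be y = slope * x + intercept in image
    coordinates. A box whose bottom edge lies more than `margin` pixels
    above the line at the box's horizontal center is treated as sky clutter.
    """
    kept = []
    for det in detections:
        center_x = 0.5 * (det.x1 + det.x2)
        horizon_y = slope * center_x + intercept
        if det.y2 >= horizon_y - margin:
            kept.append(det)
    return kept


candidates = [
    Detection(10, 90, 30, 105, 0.9),    # straddles the line: kept
    Detection(50, 20, 70, 40, 0.8),     # high in the sky: dropped
    Detection(80, 150, 120, 180, 0.7),  # on the sea surface: kept
]
ships = filter_by_sea_sky_line(candidates, slope=0.0, intercept=100.0)
print([d.score for d in ships])
```

A hard filter like this is brittle when the line estimate is wrong, which is one reason a learned interaction between the two tasks, as the thesis proposes, may be preferable.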