
Remote Sensing Image Natural Language Description Generation Model Based on Attention Mechanism and Deep Learning

Posted on: 2020-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: J W Chen
Full Text: PDF
GTID: 2392330575477315
Subject: Computer technology
Abstract/Summary:
Imaging satellite sensors have evolved rapidly over the past few decades. Optical, synthetic aperture radar, and other sensors have collected a large amount of remote sensing data, providing a source of data for millions of remote sensing systems. Compared with traditional satellite sensors, today's satellite remote sensors not only obtain higher-resolution images but are also gradually gaining the ability to remain on standby and transmit at any time. High-resolution remote sensing images carry more information than earlier low-resolution images and play a very important role in national security, environmental pollution detection, and natural disaster prevention. Although the size and information content of remote sensing images are constantly increasing, for technical reasons the information we actually extract from them has not grown at the same pace. How to obtain as much information as possible from remote sensing images is therefore an important research direction in the field. Among the many techniques for extracting remote sensing image information, the automatic generation of human-readable natural language descriptions of remote sensing images has received extensive attention.

Traditional remote sensing image captioning consists of two parts. The first part performs multi-target detection on the remote sensing image. The second part converts the resulting tag information into natural language. This traditional approach has certain limitations:
(1) Remote sensing images are generally large in pixel dimensions, while the targets occupy only a small proportion of the image. In the training phase of the convolutional neural network, only the labels of the target feature information are attended to, so a lot of background information is ignored.
(2) Traditional remote sensing image natural language processing adopts the classic paradigm framework, which is designed around the template method. It imposes language restrictions, is neither flexible nor user-friendly, and omits a lot of information when filling the templates, resulting in information loss.
(3) Because traditional remote sensing image captioning consists of two independent modules, corresponding data labels must be produced for each module during training. This greatly increases the labor required, and the high labor cost makes the approach impractical for training on large datasets. In addition, because the modules are independent of each other, the resulting model does not match the data as well as it could.

In recent years, with the rise of deep learning, neural networks based on attention mechanisms have become a hot topic in neural network research. The idea of the attention mechanism is to increase the weight of useful information, allowing the task processing system to focus on the parts of the input data that are relevant to the current output and thereby improving the quality of the output. This paper analyzes the working principle and characteristics of the traditional remote sensing image natural language description generation model. By incorporating recent deep learning techniques, it constructs a new natural language description generation model for remote sensing images. The improved model significantly improves the accuracy and completeness of the natural language descriptions of remote sensing images.

The main innovations of this paper are as follows:
1. To solve the problem of information loss in the traditional remote sensing image target detection process, this paper builds a new network structure based on deep learning and the attention mechanism, which we call a dense localization layer. By cutting up the remote sensing image, the dense localization layer can output each small target together with its associated region block. We reconstructed the traditional convolutional neural network in combination with the dense localization layer to build a new target detection model.
2. To address the information loss in the traditional remote sensing image natural language description model, this paper introduces a long short-term memory (LSTM) network, which replaces the previous classic paradigm framework, making the output more flexible and reducing the loss of information.
3. Because traditional remote sensing image captioning consists of two independent modules, a lot of manpower must be spent on labeling data for each module. To solve this problem, we built a new network architecture, the recognition network. The recognition network can stretch the features of each candidate region into a fixed-length one-dimensional column vector. Through the bilinear interpolation in the recognition network and the dense localization layer, we can train the natural language description generation model for remote sensing images end to end, which eliminates the manual data annotation for each independent module and lets the model adjust itself better to the data.
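
The following is a minimal sketch of two ideas named in the abstract: resampling candidate regions of different sizes into fixed-length vectors with bilinear interpolation, and weighting those vectors with an attention mechanism. It is not the thesis's implementation. The function names, the toy 3x3 feature map, the 2x2 output grid, and the scaled dot-product scoring rule are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import math


def bilinear_sample(fmap, y, x):
    """Sample a 2D feature map (list of rows) at fractional coordinates (y, x)."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the map boundary
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def region_to_vector(fmap, box, size=2):
    """Resample box = (y_min, x_min, y_max, x_max) onto a size x size grid
    and flatten it, so every region yields a vector of length size * size."""
    y_min, x_min, y_max, x_max = box
    vec = []
    for i in range(size):
        for j in range(size):
            ty = i / (size - 1) if size > 1 else 0.5
            tx = j / (size - 1) if size > 1 else 0.5
            vec.append(bilinear_sample(fmap,
                                       y_min + ty * (y_max - y_min),
                                       x_min + tx * (x_max - x_min)))
    return vec


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, vectors):
    """Scaled dot-product attention (an assumed scoring rule):
    returns the attention weights and the weighted sum of the vectors."""
    d = len(query)
    scores = [sum(q * v for q, v in zip(query, vec)) / math.sqrt(d)
              for vec in vectors]
    weights = softmax(scores)
    context = [sum(w * vec[k] for w, vec in zip(weights, vectors))
               for k in range(d)]
    return weights, context


# Toy single-channel feature map and two candidate regions of different sizes.
fmap = [[0.0, 1.0, 2.0],
        [3.0, 4.0, 5.0],
        [6.0, 7.0, 8.0]]
boxes = [(0.0, 0.0, 2.0, 2.0), (0.5, 0.5, 1.5, 1.5)]
vectors = [region_to_vector(fmap, b) for b in boxes]

query = [0.0, 0.0, 0.0, 1.0]  # hypothetical decoder state
weights, context = attend(query, vectors)
```

Both boxes cover different areas of the map, yet each produces a four-element vector, which is what allows a single downstream decoder to consume any candidate region. Because bilinear sampling is a weighted sum of neighboring values, it is differentiable with respect to the features, which is consistent with the end-to-end training the abstract describes.
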
Keywords/Search Tags: Deep learning, Attention Mechanism, Remote Sensing Image Captioning