
Image-Image And Image-Text Searching Methods For Remote Sensing Images

Posted on: 2024-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Zhou
GTID: 2542307079970729
Subject: Electronic information
Abstract/Summary:
Remote sensing images are images of the Earth's surface and the objects on it, acquired by sensors mounted on platforms such as drones, aircraft, and satellites. Efficient and systematic remote sensing image retrieval can provide reliable support for environmental monitoring, urban planning, agricultural management, and disaster response. Remote sensing image retrieval refers to querying specified targets in massive remote sensing data according to user needs. Because remote sensing data comes from different platforms, retrieval involves both multi-source remote sensing image retrieval and multimodal remote sensing image-text retrieval. On the one hand, current research on multi-source remote sensing image retrieval lacks effective use of geographical information, particularly in situations with weak GNSS signals or multipath effects, where multi-source image matching is used to achieve geolocation. On the other hand, remote sensing images lack direct interpretability, and there is an urgent need for a remote sensing information processing method, namely multimodal remote sensing image-text retrieval, to assist in image interpretation and related work. At the same time, existing remote sensing image and image-text retrieval methods still leave considerable room for improvement in accuracy. In response to these problems, this thesis carries out the following main research work:

(1) A Transformer-based block registration network (Tba Geo) is proposed. The method addresses the fixed block size limitation of traditional vision Transformers and uses block segmentation and positional encoding features to register the block relationships of the same instance area across images taken from different perspectives. Key instance information is then enhanced by identifying attention-salient regions. Results show that the accuracy of the Tba Geo network ranks among the best of current advanced methods. In geo-registration, R@1 (the recall rate of the top-ranked result) reaches 85.54% and AP (average precision) reaches 87.62%. In navigation tasks, R@1 reaches 91.43% and AP reaches 85.87%. Partial visualization results validate the accuracy of the method in multi-view image retrieval tasks.

(2) A multimodal remote sensing image-text cross-modal retrieval network (CAMGS), based on cross-attention and multi-granularity semantic association, is established. The method designs a new framework that integrates cross-semantic inference with multi-granularity relationship understanding. It can accurately locate object regions in images through text and strengthens the matching of global contextual information through joint multi-scale learning, which yields more accurate retrieval. Retrieval results on two remote sensing image-text datasets show that CAMGS improves the metrics over other advanced methods in both text retrieval and image retrieval. On the RSITMD dataset, for example, the algorithm achieves R@1 and R@10 of 12.24% and 41.63% in text retrieval, and R@1 and R@10 of 10.69% and 53.09% in image retrieval. Ablation experiments and visualizations of selected retrieval results confirm the feasibility of the proposed model.

This study explores targeted solutions to the main challenges in image retrieval and image-text retrieval for remote sensing imagery. By enabling remote sensing image retrieval, it provides crucial support for geolocation, remote sensing interpretation, and other fields.
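
The R@K and AP figures quoted above are standard ranking metrics. As a rough illustration of how such numbers are commonly computed for a single query, here is a minimal Python sketch. It uses the textbook definitions (R@K as "at least one relevant item in the top K", AP as the mean of precision values at each relevant hit); the function names and the toy gallery are hypothetical, and the thesis's own evaluation protocol may differ in detail.

```python
from typing import Hashable, Iterable, Sequence


def recall_at_k(ranked_ids: Sequence[Hashable],
                relevant_ids: Iterable[Hashable],
                k: int) -> float:
    """Return 1.0 if any relevant item appears in the top-k results, else 0.0.

    This is the hit-style convention often used in retrieval benchmarks;
    it is an assumption here, not taken from the thesis.
    """
    relevant = set(relevant_ids)
    return 1.0 if any(item in relevant for item in ranked_ids[:k]) else 0.0


def average_precision(ranked_ids: Sequence[Hashable],
                      relevant_ids: Iterable[Hashable]) -> float:
    """Mean of precision@rank taken at every rank where a relevant item occurs."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that never appear in the ranking contribute zero.
    return precision_sum / len(relevant)


# Toy query: the gallery is ranked by similarity, and "a" and "d" are true matches.
ranking = ["b", "a", "c", "d"]
matches = {"a", "d"}

r_at_1 = recall_at_k(ranking, matches, k=1)   # top-1 is "b", so no hit
r_at_2 = recall_at_k(ranking, matches, k=2)   # "a" is at rank 2, so a hit
ap = average_precision(ranking, matches)      # (1/2 + 2/4) / 2
```

Dataset-level scores such as those reported in the abstract would then be the average of these per-query values over all queries, expressed as a percentage.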
Keywords/Search Tags:Block matching, Cross-attention, Multi-source remote sensing image retrieval, Multimodal remote sensing text-image retrieval, Remote sensing imagery