
Research On Cross-Modal Classification Of Urban Region Based On Deep Learning

Posted on: 2021-01-02    Degree: Master    Type: Thesis
Country: China    Candidate: Y N Zhao    Full Text: PDF
GTID: 2392330605469614    Subject: Control engineering
Abstract/Summary:
With the rapid development of artificial intelligence, expectations for new "smart cities" keep rising, and the key to building one lies in intelligently identifying different urban regions. The daily operation of a city generates massive data from mobile terminals, monitoring equipment, satellites, and recorders. These data form an indirect representation of the functional patterns of different regions, and mining them makes it possible to classify urban regions.

In machine learning, the form in which data objectively exists is usually called its modality. In urban region classification, learning from a single modality often suffers from information shortage, large inter-class bias, and limited classification accuracy, while multi-modal learning in turn suffers from inconsistent research methods and inefficient multi-modal fusion. To address these problems, this thesis selects remote sensing images and user access records as two different modalities, begins with single-modality classification, and then proposes a cross-modal classification framework based on deep learning.

For the remote sensing image modality, the classification performance of classical convolutional neural networks such as VGG16, ResNet, GoogLeNet, and DenseNet is first evaluated by transfer learning. Then, given that remote sensing images are deficient in spatial information, channel-based and spatial attention mechanisms are introduced to strengthen the network's feature learning. Experiments show that convolutional neural networks equipped with attention mechanisms achieve significantly higher classification accuracy on three datasets of different resolutions, and the lower the image resolution, the more
pronounced the improvement. On the low-resolution URFCImages-9 dataset, the overall accuracy of the SE-ResNeXt network is 4.35% higher than that of DenseNet.

For the user access record modality, this thesis designs, from a deep learning perspective, two data refactoring methods that allow a convolutional neural network to learn features on its own. First, the raw records are refactored into an access frequency cube according to the time period, week, and day of each visit. Then, since a two-dimensional convolutional neural network cannot learn channel features, the frequency cube is unfolded along the time sequence and a corresponding asymmetric convolutional neural network is designed. Experiments show that the proposed "data refactoring + convolutional neural network" approach not only achieves self-learning of data features but also improves classification accuracy by 0.52% over the machine learning baseline.

Building on single-modality learning, this thesis proposes a cross-modal classification framework based on deep learning. First, following the idea of feature fusion, a dual-channel convolutional neural network (DC-CNN) is proposed, whose two channels extract features from remote sensing images and access records respectively. The classification gains of feature fusion and decision fusion are then compared, and finally different models are combined with the Stacking method. Experiments show that DC-CNN significantly improves classification accuracy over the single-modality models, confirming the effectiveness of multi-modal fusion for urban region classification. Compared with decision fusion, the feature-fusion DC-CNN improves accuracy by 1.87%, and the final mixed fusion reaches a classification accuracy of 80.42%.
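The channel attention used in SE-ResNeXt follows the squeeze-and-excitation pattern: global average pooling summarizes each channel, and two small fully connected layers produce per-channel weights that rescale the feature map. The following is a minimal NumPy sketch of that idea, not the thesis's exact implementation; the function name and weight shapes (`w1`, `w2` with a reduction of the channel dimension) are illustrative assumptions.

```python
import numpy as np

def se_channel_attention(feature_map, w1, w2):
    """Squeeze-and-Excitation style channel attention on a (C, H, W) feature map.

    Squeeze: global average pooling collapses each channel to one scalar.
    Excitation: two small fully connected layers (ReLU then sigmoid)
    produce one weight per channel that rescales the original channels.
    Weight shapes (illustrative): w1 is (C//r, C), w2 is (C, C//r).
    """
    squeezed = feature_map.mean(axis=(1, 2))          # (C,)  global average pool
    hidden = np.maximum(0.0, w1 @ squeezed)           # ReLU bottleneck, (C//r,)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid gate, (C,)
    return feature_map * weights[:, None, None]       # channel-wise rescaling
```

The output keeps the input's shape; only the relative strength of the channels changes, which is why the block can be dropped into an existing backbone between convolutional stages.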
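The access-record refactoring described above (counting visits by time period, week, and day) can be sketched as building a 3-D count tensor from raw timestamps. This is a hypothetical reconstruction under assumed axes (week index, day of week, hour-of-day as the time period); the thesis's actual cube dimensions may differ.

```python
from datetime import datetime
import numpy as np

def build_frequency_cube(timestamps, n_weeks=4, periods_per_day=24):
    """Refactor raw access timestamps into a (week, weekday, period) count cube.

    Axes (assumed): week index within the observation window, day of week
    (0 = Monday), and hour of day as the time period. A CNN can then learn
    features from this cube directly, without hand-crafted statistics.
    """
    cube = np.zeros((n_weeks, 7, periods_per_day), dtype=np.int32)
    origin = min(timestamps)
    for t in timestamps:
        week = (t - origin).days // 7
        if 0 <= week < n_weeks:
            cube[week, t.weekday(), t.hour] += 1
    return cube

# Example: two Monday-morning visits, one Tuesday-evening visit
cube = build_frequency_cube([
    datetime(2020, 1, 6, 9), datetime(2020, 1, 6, 9), datetime(2020, 1, 7, 18),
])
```

Unfolding such a cube along its week axis into channels is one way a 2-D convolutional network could be given a channel dimension to learn from, matching the motivation for the asymmetric network design.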
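Feature-level fusion in a dual-channel model amounts to concatenating the per-branch feature vectors before a shared classifier, in contrast to decision fusion, which averages or votes over per-branch predictions. A minimal sketch of the fusion step, with an assumed linear classifier `w_cls` standing in for the model's real classification head:

```python
import numpy as np

def fuse_features(img_feat, rec_feat, w_cls):
    """Feature-level fusion for a two-branch classifier.

    The image-branch and record-branch feature vectors are concatenated,
    then a single linear head maps the fused vector to class probabilities.
    w_cls has shape (n_classes, d_img + d_rec) — an illustrative stand-in.
    """
    fused = np.concatenate([img_feat, rec_feat])      # (d_img + d_rec,)
    logits = w_cls @ fused                            # (n_classes,)
    exp = np.exp(logits - logits.max())               # stable softmax
    return exp / exp.sum()                            # class probabilities
```

Because the classifier sees both modalities jointly, it can learn cross-modal interactions that decision fusion cannot, which is consistent with the reported 1.87% gain of feature fusion over decision fusion.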
Keywords/Search Tags:Urban Region Classification, Remote Sensing Image, Access Record, Data Refactoring, Multi-Modal Fusion