| China's level of urbanization continues to rise, and urban development has entered an important period of urban renewal. Refined "stock mining" of existing space is a new trend in urban development, and clarifying the layout of urban functional areas plays an important role in strengthening the use of existing stock and optimizing the urban industrial structure. With the advent of the big data era, new data types continue to emerge. Among them, points of interest (POI) are point-like spatial data representing geographic entities, characterized by large volume, high precision, and strong timeliness. Combined with emerging deep learning models for natural language processing, they make it possible to fully mine the semantic information in spatial big data. Therefore, based on POI data and road network data, this paper constructs a research framework covering geographic semantic feature extraction, urban functional area identification, and accuracy comparison and evaluation, and applies it to identify urban functional areas in Beijing. The main research work of this paper is as follows. First, the POI data and road network data were preprocessed and cleaned. For the POI data, spatial location correlation was fully considered when constructing POI sequences. Road centerlines were extracted from the road network data using morphological image segmentation. For the basic research units, grid-based and road-network-based division methods were compared, together with the corresponding POI-related indicators of the research units at three scales, and the best research unit was selected. Second, the latent semantic features of the research units and POI classes were extracted using natural language processing models. POI classes were treated as words, POI sequences as sentences, and research units as documents, and a corpus was built for training word vectors and paragraph vectors. Training these vectors yielded high-dimensional semantic feature vectors that can represent spatial distribution, and the trained word vectors were used to compare the functional similarity between POI classes. Third, the K-means clustering method was used to cluster the trained paragraph vectors, the LDA model was used to extract topics from the clustering results (the urban functional area identification results), and the functional areas were labeled using the enrichment factor. Finally, the accuracy of the TF-IDF, Word2Vec, and Doc2Vec models was compared using a random forest classifier. The results show that emerging natural language processing models can effectively extract the latent semantic features of basic urban research units. The POI class vectors trained by the model can also be used to calculate the functional similarity between POI classes. Moreover, the Doc2Vec model performed significantly better than the TF-IDF and Word2Vec models in identifying urban functional areas. This study can support dynamic monitoring of urban spatial development, provide a reference for urban planning, and offer new methods and ideas for dividing POI categories in urban space. |
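
The labeling step can be illustrated with a minimal sketch. The abstract does not give the formula for the enrichment factor, so the code below assumes a common definition: the share of a POI class within a cluster divided by its share across the whole study area, with values above 1 indicating over-representation. The unit names, POI classes, cluster labels, and the function `enrichment_factors` are all hypothetical toy values chosen for illustration, and the cluster labels merely stand in for the K-means output.

```python
from collections import Counter

# Hypothetical toy corpus: each research unit is a "document" whose "words"
# are the POI classes of the points it contains.
units = {
    "unit_a": ["restaurant", "restaurant", "shop", "office"],
    "unit_b": ["office", "office", "bank", "restaurant"],
    "unit_c": ["school", "residence", "residence", "shop"],
    "unit_d": ["residence", "residence", "school", "restaurant"],
}

# Hypothetical cluster labels standing in for the K-means result.
clusters = {"unit_a": 0, "unit_b": 0, "unit_c": 1, "unit_d": 1}


def enrichment_factors(units, clusters):
    """Return {cluster: {poi_class: EF}}.

    Assumed definition (not taken from the paper):
    EF = (count of class in cluster / all POIs in cluster)
         / (count of class in study area / all POIs in study area)
    """
    city_counts = Counter()
    cluster_counts = {}
    for unit_id, pois in units.items():
        city_counts.update(pois)
        cluster_counts.setdefault(clusters[unit_id], Counter()).update(pois)

    city_total = sum(city_counts.values())
    result = {}
    for cluster_id, counts in cluster_counts.items():
        cluster_total = sum(counts.values())
        result[cluster_id] = {
            poi_class: (count / cluster_total)
            / (city_counts[poi_class] / city_total)
            for poi_class, count in counts.items()
        }
    return result


ef = enrichment_factors(units, clusters)
for cluster_id in sorted(ef):
    # List classes from most to least enriched within each cluster.
    ranked = sorted(ef[cluster_id].items(), key=lambda kv: -kv[1])
    print(cluster_id, ranked)
```

Under this assumed definition, a cluster would be labeled by its most enriched POI classes, for example a cluster in which office and bank POIs are over-represented could be read as a business area.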