
Research On Construction Method Of POI Database Based On Crowd Sourcing Data Matching And Fusion

Posted on: 2021-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: C R Tian
Full Text: PDF
GTID: 2370330629985289
Subject: Cartography and Geographic Information System
Abstract/Summary:
As an important part of geospatial information, POI (point of interest) data is an essential basic resource. It has wide application value in many fields and is of great significance to user-oriented location services. With the continuing development of mobile Internet and crowdsourcing technologies, a large number of crowdsourced POI data resources have become available online, and demand for location-based geographic information services is also increasing. Crowdsourced POI data is characterized by large volume, high timeliness, rich information, and uncertain quality. POIs from different sources can complement each other, and conflating them can support richer location services. However, different sources of POI data differ in their geometric and semantic expression and in the completeness and richness of their information. The same POI entity may have different names, geometric positions, or classification systems across sources, and there is no unified expression model. This thesis therefore takes crowdsourced POI data as its research object, explores a precise matching algorithm for heterogeneous POI data, and studies data fusion rules based on matched POI pairs. Based on a unified expression model, it builds a POI database using the proposed matching algorithm and data fusion strategies. In addition, the W3C PROV model is used to record provenance information while conflating the different POI data sources.

First, in the POI data matching phase, several similarity measures are considered and combined using machine learning models, and graph theory is then introduced to achieve precise matching. The selection of similarity measures is crucial for POI matching, because different similarity measures, and even different ways of combining them, often result in different matching accuracy. Based on the spatial location, classification system, and name attributes of the POI data, five similarity measures are selected for the matching calculation, and a matching method combining machine learning and graph theory algorithms is proposed. The machine learning model replaces the weighted combination of multiple similarities based on manually assigned weights. Because the graph theory algorithm can effectively identify one-to-one POI matching pairs, it is used to eliminate mismatched pairs and achieve accurate matching of similar pairs.

Second, in the POI data fusion phase, this thesis analyzes and summarizes the differences among the crowdsourced POI data sources. Three categories of conflict are identified: naming conflicts, structural conflicts, and attribute conflicts. Taking into account the attribute richness and attribute characteristics of the POI data, the thesis focuses on the resolution of attribute conflicts. Three strategies are proposed: conflict neglect, conflict avoidance, and conflict resolution. On this basis, attribute fusion rules for crowdsourced POI data are constructed, which effectively avoid redundant attributes and missing key data in the fused POI data.

Finally, for the construction of the POI entity database, a unified semantic expression model for heterogeneous POI data from multiple sources is established. The POI entity database is constructed based on the matching and fusion of crowdsourced POI data. The source information of the conflated POI data is then traced and expressed using the W3C PROV-DM model.
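The matching phase described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not name its five similarity measures, its machine learning model, or its graph algorithm. The sketch assumes three illustrative measures (name, spatial distance, category), a logistic scoring function with hypothetical hand-picked `WEIGHTS` and `BIAS` standing in for a trained model, and an exhaustive maximum-weight one-to-one assignment standing in for the graph theory step. All function names, field names, and sample records are assumptions made for illustration.

```python
import math
from difflib import SequenceMatcher
from itertools import permutations


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def similarity_features(a, b, max_dist_m=200.0):
    """Three illustrative similarity measures, each in [0, 1]."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    dist = haversine_m(a["lat"], a["lon"], b["lat"], b["lon"])
    spatial_sim = max(0.0, 1.0 - dist / max_dist_m)
    category_sim = 1.0 if a["category"] == b["category"] else 0.0
    return [name_sim, spatial_sim, category_sim]


# Hypothetical parameters; in practice these would be learned from labelled pairs.
WEIGHTS = [4.0, 4.0, 1.0]
BIAS = -5.0


def match_probability(features):
    """Logistic score standing in for a trained classifier."""
    z = BIAS + sum(w * f for w, f in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-z))


def one_to_one_matches(source_a, source_b, threshold=0.5):
    """Maximum-weight one-to-one assignment by exhaustive search.

    Only suitable for tiny inputs; requires len(source_a) <= len(source_b).
    Pairs scoring below `threshold` contribute nothing and are not returned.
    """
    scores = [[match_probability(similarity_features(a, b)) for b in source_b]
              for a in source_a]
    best, best_total = (), -1.0
    for perm in permutations(range(len(source_b)), len(source_a)):
        total = sum(scores[i][j] for i, j in enumerate(perm)
                    if scores[i][j] >= threshold)
        if total > best_total:
            best, best_total = perm, total
    return [(i, j, scores[i][j]) for i, j in enumerate(best)
            if scores[i][j] >= threshold]


# Made-up sample records from two hypothetical sources.
source_a = [
    {"name": "Central Park Cafe", "lat": 30.5400, "lon": 114.3600, "category": "food"},
    {"name": "City Library", "lat": 30.5450, "lon": 114.3650, "category": "education"},
]
source_b = [
    {"name": "City Public Library", "lat": 30.5451, "lon": 114.3651, "category": "education"},
    {"name": "Central Park Café", "lat": 30.5401, "lon": 114.3601, "category": "food"},
    {"name": "Riverside Hotel", "lat": 30.6000, "lon": 114.4000, "category": "lodging"},
]

for i, j, score in one_to_one_matches(source_a, source_b):
    print(source_a[i]["name"], "<->", source_b[j]["name"], round(score, 2))
```

The exhaustive search keeps the sketch dependency-free but grows factorially with input size. For realistic data volumes, a polynomial-time assignment solver such as `scipy.optimize.linear_sum_assignment` would be a more practical choice, typically combined with spatial pre-filtering of candidate pairs.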
Keywords/Search Tags:Crowdsourced POI Data, Machine Learning, Graph Theory, Semantic Expression Model