| With the rapid development of the Internet, IP geolocation has come to play an important role in many fields. Online fraud detection, targeted advertising, and other location-based services all require IP geolocation support. Since IP geolocation was first proposed, many excellent IP geolocation algorithms and products have been developed. However, these technologies have their own limitations, such as low accuracy and poor stability, and high-precision IP geolocation still faces major challenges. To address these problems, this thesis studies IP geolocation for both IPv6 and IPv4 addresses. The main work and innovations are as follows: (1) This thesis proposes a city-level IP geolocation algorithm based on LightGBM for IPv6 address geolocation. IPv6 has a very large address space but few active addresses, so locating IPv6 addresses through network measurement is of limited feasibility. However, IPv6 addresses are allocated strictly according to a specified structure and show a clustered distribution after allocation. Based on this observation, this thesis proposes a city-level IPv6 geolocation method built on the IPv6 address allocation strategy: it obtains province-level location information and reliable landmarks by fusing multiple database sources, then trains a LightGBM city classifier on features extracted from the location information contained in the IPv6 address itself to predict the city. IPv6 addresses from ten cities in Shaanxi Province are selected for city-level geolocation experiments. The results show that the method reaches 97% prediction accuracy at the city level, outperforming in both accuracy and efficiency the random forest geolocation method, which performs well for city-level IPv4 geolocation. (2) A twice-constraint IP geolocation algorithm is proposed based on
routing-level paths and deep neural networks for IPv4 address geolocation. First, IP addresses with high geographic confidence are collected and selected as landmarks, and network measurements are performed on them to build a route-level path topology and a delay database. A delay calculation method based on distribution similarity is proposed: by analyzing the route-level path topology, it finds the landmarks that share a route with the target IP and have the highest delay similarity to it, and it locates the target within the multiple constraint regions centered on those landmarks. This yields the primary-constraint geolocation of the target IP. Next, a deep neural network is trained on features extracted from the delay database to predict the latitude and longitude of the target, yielding the secondary-constraint geolocation. The two positions are then used to correct each other, and the coordinates of the most likely location are taken as the final result of the algorithm. Experimental results show that the proposed twice-constraint algorithm keeps the geolocation error within 12 km for 97% of the measured target IPs, with an average error of 8.1 km, outperforming both traditional geolocation methods based on network measurement alone and machine-learning geolocation methods in accuracy and stability. Finally, the work of this thesis is summarized and future research directions for improving IP geolocation accuracy are outlined. |
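
The abstract does not state which features are extracted from an IPv6 address for the city classifier. As a minimal sketch only, assuming the features are the hexadecimal nibbles of the leading 64 bits (the routing prefix and subnet part, where allocation structure is most likely to show), the extraction step might look like the following. The function name `ipv6_prefix_nibbles` and the `prefix_bits` parameter are hypothetical and not taken from the thesis.

```python
import ipaddress


def ipv6_prefix_nibbles(addr: str, prefix_bits: int = 64) -> list[int]:
    """Return the 4-bit groups (nibbles) of the leading prefix_bits of an IPv6 address.

    Hypothetical feature extractor: the thesis abstract does not specify
    its feature set, so this is an illustrative assumption.
    """
    if prefix_bits % 4 != 0 or not 0 < prefix_bits <= 128:
        raise ValueError("prefix_bits must be a multiple of 4 between 4 and 128")
    value = int(ipaddress.IPv6Address(addr))  # the address as a 128-bit integer
    n = prefix_bits // 4                      # number of nibbles to keep
    prefix = value >> (128 - prefix_bits)     # drop the low-order bits
    # Peel off nibbles from most significant to least significant.
    return [(prefix >> (4 * (n - 1 - i))) & 0xF for i in range(n)]


# Addresses in the same /64 yield the same feature vector.
features = ipv6_prefix_nibbles("2001:db8:1234:5678::1")
```

Each address becomes a fixed-length vector of small integers, which could be passed as categorical features to a gradient-boosted tree classifier such as LightGBM. Because addresses allocated from the same block share their leading nibbles, such a representation would reflect the clustered distribution the abstract describes.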