XLBoost-Geo: An IP Geolocation System Based on Extreme Landmark Boosting
IP geolocation aims to determine the geographical position of Internet devices and plays an essential role in many Internet applications. A long-standing challenge in this field is finding a large number of highly reliable landmarks, which is the key to improving geolocation precision. Despite many efforts to this end, many IP geolocation methods still suffer from unacceptable error distances because they lack landmarks. In this paper, we propose a novel IP geolocation system, named XLBoost-Geo, which focuses on increasing the number and density of highly reliable landmarks. The main idea is to extract location-indicating clues from web pages and to locate the web servers based on those clues. Based on the resulting landmarks, XLBoost-Geo is able to geolocate arbitrary IPs with small error distance. Specifically, we first design an entity extraction method based on a bidirectional LSTM neural network with a self-adaptive loss function (LSTM-Ada) to extract the location-indicating clues from web pages, and then generate landmarks based on the clues. Then, by measuring network latency and topology, we estimate the landmark closest to the target IP and assign the coordinates of that landmark as the target's location. The results of our experiments validate the effectiveness and efficiency of the extraction method; the precision, number, and coverage of the landmarks; and the precision of the IP geolocation. On RIPE Atlas nodes, XLBoost-Geo achieves a median error distance of 2,561 m, outperforming SLG and IPIP.
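The final step described above, choosing the closest landmark and reusing its coordinates, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a relative delay (in milliseconds) between the target and each candidate landmark has already been measured, it ignores the topology measurements mentioned in the abstract, and the names `Landmark`, `geolocate`, and `error_distance_m` are hypothetical. The haversine helper shows how an error distance in meters could be computed against a known ground-truth position.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, used by the haversine formula


@dataclass(frozen=True)
class Landmark:
    """Hypothetical landmark record: a web server IP with known coordinates."""
    ip: str
    lat: float
    lon: float


def geolocate(delay_ms):
    """Return (lat, lon) of the landmark with the smallest measured delay.

    `delay_ms` maps each candidate Landmark to an already-measured relative
    delay to the target IP (an assumption of this sketch).
    """
    if not delay_ms:
        raise ValueError("at least one landmark measurement is required")
    closest = min(delay_ms, key=delay_ms.get)
    return (closest.lat, closest.lon)


def error_distance_m(estimate, truth):
    """Great-circle distance in meters between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, estimate)
    lat2, lon2 = map(math.radians, truth)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Illustrative values only; the IPs come from the documentation range 192.0.2.0/24.
measurements = {
    Landmark("192.0.2.10", 39.9042, 116.4074): 4.8,
    Landmark("192.0.2.20", 39.9100, 116.4000): 0.9,
    Landmark("192.0.2.30", 31.2304, 121.4737): 27.3,
}
estimate = geolocate(measurements)
```

With these made-up measurements, `estimate` is the coordinate pair of the second landmark, since it has the smallest delay. A real system would need to handle noisy or missing measurements, which this sketch does not attempt.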