
A Primer on Geolocation and Location-Based Services: Geolocation from IP Address

 

We now take a slight turn from more exact geolocation technologies to one that is more basic: IP address-based geolocation.  IP address-based geolocation (IPG) was the earliest online geolocation technique and has been around since 1999.  It estimates a user’s geographic latitude and longitude and, by inference, city, region and nation by comparing the user’s public IP address with the known locations of electronically neighboring servers and routers.  While IPG is not specific to mobile, the more complex algorithms that geolocate mobile devices use it as one of their inputs.  It is thus worth taking time to understand what it is, how it works, and how accurate it is.

What we will find is that, as a stand-alone technology, IPG cannot locate a device with any reasonable degree of accuracy.  Moreover, there is no magic to linking an IP address to a location: the link must come from some type of third-party service that has manually (or semi-manually) mapped IP addresses to geolocations.  Even then, without the help of your ISP in providing more information about a device, the best you can do is the location of your ISP’s host server.  However, in a later entry we will discover that, when combined with other forms of geolocation, an IP address can be used as an extra signal to confirm location.

Overview

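Every device connected to the public Internet is assigned a unique number known as an Internet Protocol (IP) address. IPv4 addresses consist of four numbers, each between 0 and 255, separated by periods (also called a ‘dotted-quad’) and look something like 192.168.0.1. (That particular address comes from a range reserved for private networks, but the format is the same for public addresses.)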
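To make the dotted-quad idea concrete, here is a minimal Python sketch showing that the four numbers are just a readable spelling of a single 32-bit value. The function name dotted_quad_to_int is my own, not part of any library; the last line cross-checks it against Python’s standard ipaddress module.

```python
import ipaddress


def dotted_quad_to_int(address: str) -> int:
    """Convert an IPv4 dotted-quad string to its 32-bit integer value."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid dotted-quad: {address!r}")
    value = 0
    for octet in octets:
        # Each octet contributes 8 bits, most significant first.
        value = (value << 8) | octet
    return value


# Cross-check the hand-rolled conversion against the standard library.
assert dotted_quad_to_int("192.168.0.1") == int(ipaddress.IPv4Address("192.168.0.1"))
```

Geolocation databases typically work with this integer (or an equivalent range) rather than the string, because it makes “which block does this address fall in?” a simple numeric comparison.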

Since these numbers are usually assigned to Internet service providers within region-based blocks, an IP address can often be used to identify the region or country from which a computer is connecting to the Internet, and sometimes the user’s general location.  At one time ISPs issued one IP address to each user. These are called static IP addresses. Because Internet usage exploded far beyond what was envisioned in the early design of the IP standard (known as IPv4) and the number of IP addresses is limited, ISPs moved toward allocating IP addresses dynamically out of a pool, using a technology called Dynamic Host Configuration Protocol (DHCP).  This dynamic allocation makes physically locating a device using an IP address tougher.

As we move forward in this discussion, an example will help us understand what is required to convert an IP address to a physical location.  Below are two different services which provide geolocation information from an IP address.
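The block-based lookup can be sketched in a few lines of Python. Everything in the table below is an assumption for illustration: the address blocks are ranges reserved for documentation, and the region labels are invented. Real databases hold a very large number of blocks, but the matching rule (prefer the most specific block that contains the address) is the same idea.

```python
import ipaddress

# Hypothetical allocation table. The blocks are documentation-only ranges
# and the region labels are made up for this example.
BLOCKS = {
    "192.0.2.0/24": "Region A",
    "198.51.100.0/24": "Region B",
    "198.51.100.128/25": "Region B, City X",  # a more specific sub-block
    "203.0.113.0/24": "Region C",
}


def region_for(address: str):
    """Return the label of the most specific block containing the address."""
    ip = ipaddress.ip_address(address)
    best = None  # (network, label) of the longest matching prefix so far
    for cidr, label in BLOCKS.items():
        network = ipaddress.ip_network(cidr)
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, label)
    return best[1] if best else None


print(region_for("198.51.100.200"))  # falls in the more specific /25 block
print(region_for("198.51.100.5"))    # only the /24 block matches
```

Note what the table can and cannot say: it maps an address to whatever location someone recorded for the block, which is why dynamic allocation out of a shared pool weakens the link between an address and one physical device.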

Figure: Example of how two different services, Google and WhatIsMyIPAddress.com, use an IP address to determine a location

 

The first service is Google search.  If you type “what is my ip address” into the Google search box, a set of results is returned.  The IP address of the device from which the search was made appears at the top of those results.  On the left-hand side, Google shows it has auto-detected my location in Carmel Valley Village (actually, I am about a mile away and a few hundred feet above Carmel Valley Village).

The second service is WhatIsMyIPAddress.com, which is the first organic listing in the result set returned from the “what is my ip address” search.  In this case, the service shows “my device” as being in Salinas, California, about 20 miles away as the crow flies.  Or actually it doesn’t show my device as being in Salinas.  It shows that my ISP is Comcast Cable and that my ISP is in Salinas.

Same query.  Two different services.  Two very different results.  The reason for the difference is that Google is using multiple sources to geolocate my device (IP address, Wi-Fi-based latitude and longitude), whereas WhatIsMyIPAddress is using only DNS-based information, including traceroute mapping, for this particular page.  Once I approve the use of geolocation services, WhatIsMyIPAddress yields similar results to Google because it also triangulates across multiple sources.

The rest of this post will delve into why this has occurred, which involves understanding the technology used to perform IP-based address geolocation.

It’s a Ping Thing

To understand what is going on, we have to start with the rawest form of the technology underlying IP addresses, which is the TCP/IP model itself.  Our most basic tool for getting a host from an IP address is the ping command.  We won’t go into how ping operates in detail here, but you can find a great overview at the GalaxyVisions website. In brief, ping is an application that reaches down into the internetworking layer of the TCP/IP model directly: it sends an ICMP Echo Request message to the target address and waits for an Echo Reply.  The Echo Reply itself only confirms that the address is reachable; the hostname that ping displays alongside it comes from a reverse DNS lookup on the address.  Here is an example of what a ping looks like for 24.130.244.124:

 

What the image shows is that, alongside the echo request/reply, ping reports the hostname registered for that IP address, which in this case is a host at comcast.net.  Note that the IP address is supposedly the IP address of my computer in my house.  But the name it resolves to is not my computer’s.  Instead, what is returned identifies the server of my ISP to which my account is attached.
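The hostname in a ping display comes from a reverse DNS (PTR) lookup, which you can reproduce without ping. The Python sketch below uses only the standard library: reverse_pointer builds the special in-addr.arpa name that is queried, and socket.gethostbyaddr asks the system resolver for the answer. The helper names are mine, and the live lookup needs network access and may return nothing if no PTR record exists.

```python
import ipaddress
import socket


def ptr_query_name(address: str) -> str:
    """Build the in-addr.arpa name that a reverse DNS lookup queries."""
    return ipaddress.ip_address(address).reverse_pointer


def reverse_lookup(address: str):
    """Return the hostname registered for an address, or None if unavailable."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
        return hostname
    except OSError:
        # Covers "no PTR record" as well as resolver or network failures.
        return None


# The octets are reversed so DNS delegation follows the address hierarchy.
print(ptr_query_name("24.130.244.124"))
```

Whoever controls the address block controls the PTR record, so the name that comes back is whatever the ISP chose to publish for that address.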

From DNS Hostname to Location

So in the prior step, we were able to use Ping to get to a server/domain name.  The next step is to get from the server name to its location.  This is where it took me some time to understand the options available and how they work.  In this section I am going to discuss four:

  • DNS LOC Records
  • Whois Data
  • W3C Geolocation Services in Web browsers
  • Third-Party Service Providers that Map IP Address to a Physical Location

DNS LOC Records

In the Domain Name System there is an experimental standard called RFC 1876, which is titled “A Means for Expressing Location Information in the Domain Name System.”  This standard defines a format for a location record (LOC record) that can be used to geolocate hosts, networks, and subnets using a DNS lookup.  You can read the standard to get all the details, but the format of the record looks like this:

Figure: Sample format of a DNS LOC record (RFC 1876)

The size is the diameter of a sphere enclosing the chosen host, and the horizontal and vertical precision fields (horiz pre and vert pre) give the precision of the estimate.  Latitude, longitude and altitude mean what you would expect.

DNS LOC has two problems.  First, it has only been defined for a few sites around the world.  “Defined” means that an ISP has manually created a LOC record for its hosts (the ISP adds the record to its DNS servers).  Second, it once again only gives the location of the host server – in this case hsd1.ca.comcast.net – not the location of my device.
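For readers who want something concrete, RFC 1876 also defines a text form for LOC records in zone files: latitude and longitude as degrees with optional minutes and seconds followed by a hemisphere letter, then altitude and, optionally, size, horizontal precision and vertical precision in meters. Below is a minimal Python sketch that parses that text form into decimal degrees. The function name and the sample record are my own (the coordinates are made up), and the sketch skips the range checks a real parser would need.

```python
def parse_loc(rdata: str) -> dict:
    """Parse the text form of a LOC record's data into decimal values."""
    tokens = rdata.split()

    def take_angle(hemispheres) -> float:
        # Collect 1-3 numbers (degrees, minutes, seconds) up to the hemisphere.
        parts = []
        while tokens and tokens[0] not in hemispheres:
            parts.append(float(tokens.pop(0)))
        if not tokens or not 1 <= len(parts) <= 3:
            raise ValueError("malformed LOC coordinate")
        hemisphere = tokens.pop(0)
        parts += [0.0] * (3 - len(parts))  # missing minutes/seconds are zero
        degrees = parts[0] + parts[1] / 60 + parts[2] / 3600
        return -degrees if hemisphere in ("S", "W") else degrees

    latitude = take_angle(("N", "S"))
    longitude = take_angle(("E", "W"))

    # Remaining fields: altitude, then optional size and precisions, in meters.
    meters = [float(token.rstrip("m")) for token in tokens]
    if not 1 <= len(meters) <= 4:
        raise ValueError("malformed LOC altitude/size/precision fields")
    defaults = [None, 1.0, 10000.0, 10.0]  # defaults given in RFC 1876
    meters += defaults[len(meters):]

    return {
        "latitude": latitude,
        "longitude": longitude,
        "altitude_m": meters[0],
        "size_m": meters[1],
        "horiz_pre_m": meters[2],
        "vert_pre_m": meters[3],
    }


# Made-up record data for illustration only.
print(parse_loc("36 28 47 N 121 43 58 W 120m 30m 100m 10m"))
```

The precision fields matter as much as the coordinates: a record whose horizontal precision is left at the 10,000 meter default is telling you not to trust it at street level.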

So this doesn’t help us.

WhoIs

Anyone who has used the Internet extensively knows about the whois service.  This service describes the owner of a particular domain.  It is possible to gain some geolocation information about a domain from it – but that data is usually the headquarters of the owner of the domain and has almost no relation to the location of my domain host, much less my computer.  The example from Comcast:

Figure: Example of Comcast’s whois entry

 

Note that there is no entry for my specific hosting server hsd1.ca.comcast.net (left image) and the information about the top-level domain shows it in Wilmington, DE (right image).  Not much help at all.

W3C Geolocation Services in Web Browsers

The W3C Geolocation API specification, which I will call Geolocation Services here, describes an API that provides scripted access to geographical location information, such as latitude and longitude, associated with a specific device via the device’s browser.  Geolocation Services are agnostic of the underlying location information sources, which can include Global Positioning System (GPS) and location inferred from network signals such as IP address, RFID, WiFi and Bluetooth MAC addresses, and GSM/CDMA cell IDs, as well as user input.

The API allows developers to make both “one-shot” position requests and requests for repeated position updates.  It also provides the ability to cache historic positions and use the data to add geolocation functionality to their applications. For the geeks in the room, and so newbies won’t be confused when they find more geolocation-based acronyms, Geolocation Services builds upon earlier work in the industry, including [AZALOC], [GEARSLOC], and [LOCATIONAWARE].

Note that Geolocation Services draws on third-party services – it does not do any geolocation itself from the device.  Thus, all W3C geolocation services – and this is not to minimize their value in developing online and mobile location-aware applications – are simply an aggregation tool to allow developers to draw on whatever third party sources are available to geolocate a device and feed that information into their applications.

Also note that there is nothing explicitly tying these services to the DNS record.  How do you make that connection?

Well, as the next section shows, W3C geolocation services can draw upon a service like hostip.info to get the geographic location of the host.

Third-Party Service Providers

At the end of the day, there is no “magic bullet” technology that links an IP address to a geolocation.  The only way this occurs is through a third-party service that has used numerous, usually labor-intensive and semi-manual techniques to acquire a geolocation for an IP address.

A part of me wants to chat about Netgeo here – which was one of the earliest attempts to geolocate devices by their IP address.  However, Netgeo has not been maintained, and frankly I’ve covered pretty much everything they discuss in the prior sections.  But if you are interested in this bit of IP address-based geolocation, click on the link.

Having used a number of these services, I can tell you that the majority do not do a particularly good job of geolocating a host server using an IP address, much less a specific device.  I’ll use hostip.info, as it is the most transparent.  hostip.info uses crowdsourcing to geolocate a host. Developers and ISPs can enter the location of their servers into the hostip database.  The database is then freely available to anyone who wishes to use it.  Here is an example of how my location fared:

Figure: Example of hostip.info IP address-based geolocation

 

Tustin is several hundred miles to the south of my location. So as you can see, not very accurate at all.
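To put a number on a miss like this, you can compute the great-circle distance between the reported location and the true one with the haversine formula. The Python sketch below does that; the coordinates are my own approximate values for Tustin and Carmel Valley Village, not figures taken from either service, so treat the result as a rough error estimate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate coordinates, assumed for illustration.
TUSTIN = (33.7458, -117.8262)
CARMEL_VALLEY_VILLAGE = (36.4797, -121.7327)

error_km = haversine_km(*TUSTIN, *CARMEL_VALLEY_VILLAGE)
print(f"error: {error_km:.0f} km ({error_km * 0.621371:.0f} miles)")
```

Running the same calculation against each provider’s answer is a simple way to compare services against a location you already know.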

Circling Back to Our Example

So how do these two services handle geolocating my computer?  First, they are both using the W3C geolocation API.  What differs are the sources they use to identify a location.

Obviously Google is relatively accurate in this example, although I do not consider a one mile radius to be particularly useful for those of us who are trying to deliver fine-grained location-based services.  Google manages this through a combination of sources:

If you allow Google Chrome to share your location with a site, the browser will send local network information to Google Location Services to get an estimate of your location. The browser can then share your location with the requesting site. The local network information used by Google Location Services to estimate your location includes information about visible WiFi access points, including their signal strength; information about your local router; your computer’s IP address. The accuracy and coverage of Google Location Services will vary by location.

Google Chrome saves your location information so that it can be easily retrieved. This information is periodically updated; the frequency of updates depends on changes to your local network information.

I should add that when it comes to Android and Android Location Services, Google also uses GPS and Assisted GPS technologies for geolocation.

 Android Location Services periodically checks on your location using GPS, Cell-ID, and Wi-Fi to locate your device. When it does this, your Android phone will send back publicly broadcast Wi-Fi access points’  Service set identifier (SSID) and Media Access Control (MAC) data.

And this isn’t just how Google does it; it’s how everyone does it. It’s standard industry practice for location database vendors.

WhatIsMyIPAddress, on the other hand, is using only the IP address-based location from a third-party service.  This is a choice on their part because I haven’t opted in to location-based services.  Once I do, I get the map below, and one source correctly shows my location as Carmel Valley (it is interesting to note, as well, the different results depending on which third-party provider you use).  But this is because we are no longer relying on IP address-based geolocation alone.  WhatIsMyIPAddress.com is also using Wi-Fi-based geolocation and cell-tower triangulation via the W3C location services API to get a more accurate read.
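One simple way to think about combining sources is that each one reports a position plus an accuracy radius (the W3C API exposes accuracy in meters alongside latitude and longitude), and the consumer prefers the tightest estimate. The Python sketch below illustrates only that idea. It is not how Google or WhatIsMyIPAddress actually combine signals, and the source names, coordinates and radii are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    source: str        # e.g. "ip", "wifi", "cell" (labels are illustrative)
    latitude: float
    longitude: float
    accuracy_m: float  # radius of uncertainty in meters; smaller is better


def best_estimate(estimates):
    """Pick the estimate with the smallest reported accuracy radius."""
    if not estimates:
        return None
    return min(estimates, key=lambda estimate: estimate.accuracy_m)


# Invented values: an IP-only fix is city-level at best, Wi-Fi is far tighter.
candidates = [
    Estimate("ip", 36.68, -121.66, 30000.0),
    Estimate("cell", 36.49, -121.72, 2500.0),
    Estimate("wifi", 36.48, -121.73, 150.0),
]
print(best_estimate(candidates).source)
```

Seen this way, the IP-derived estimate is still useful as a sanity check: if the tighter sources land far outside its radius, something is probably wrong with one of them.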

Conclusion

In short, after exploring IP address-based geolocation, the conclusion is that on its own it is a non-starter for any application except those that can live with the coarsest of location estimates.

Next up: Assisted GPS.

 
