The C̶a̶k̶e̶ User Location is a Lie: Here's Why

31 Jul 2024

I recently sat in on a discussion about programming based on user location. Folks that are way smarter than me covered technical limitations, legal concerns, and privacy rights. It was nuanced, to say the least.

So, I thought I’d share some details.

Location, Location, Location

There are several common examples when you may want to add location-based logic to your app:

  • You want to set the language or currency of your app based on the region.
  • You’re offering discounts to people in a given country.
  • You have a store locator that should show users their nearest location.
  • Your weather app relies on a location before it can offer any sort of data.
  • You want to geofence your app for legal reasons (e.g. cookie banners).

These are just a few use cases. There are plenty more, but from these, we can identify some common themes:

  • Presentation/User experience: Using location information to improve or streamline the user experience.
  • Function/Logic: The application’s business logic changes based on location.
  • Policy/Compliance: You have legal requirements to include or exclude functionality.

It’s not always this clear-cut. There is overlap in some cases, but it’s important to keep these distinctions in mind because getting it wrong has different levels of severity. Showing the wrong currency is not as bad as miscalculating tax rates, which is still not as bad as violating an embargo, for example.

With that in mind, let’s look at the options we have.

Getting User Location

There are four ways I know of to access the user’s location, each with its pros and cons.

  • User reporting
  • Device heuristics
  • IP Address
  • Edge compute

Getting User Location From the User

This is when you have a form on your website that explicitly asks a user where they are. It may offer user experience improvements like auto-completing an address, but ultimately, you are taking the user at their word.

This method is easy to get started with (an HTML form will do), provides information as reliable as the user allows, and is flexible enough to support different locations.

The most obvious downside is that it may not be accurate if the user mistypes or omits information. Furthermore, it’s very easy for a user to provide false information. That may be acceptable in some cases and a big mistake in others.

Take this, for example.

Screenshot of a form asking for the user's location, but the user put, "Dicktown"

This is a legitimate place in New Jersey…it’s ok to laugh. I actually went down a bit of a rabbit hole “researching” real places with funny names and spent way too much time, but I came across some real gems:

  • Monkey’s Eyebrow – Kentucky
  • Big Butt Mountain – North Carolina
  • Unalaska – Alaska
  • Why – Arizona
  • Whynot – North Carolina

Full list of real places with funny names

  • Accident, Maryland
  • Bacon Level, Alabama
  • Bald Head, Maine
  • Bat Cave, North Carolina
  • Batman
  • Beaverlick
  • Bell End
  • Bigfoot, Texas
  • Bitter End, Tennessee
  • Booger Hole, West Virginia
  • Boring, Oregon (I’ve been there!)
  • Breeding, Kentucky
  • Burnout, Alabama
  • Burnt Store, Florida
  • Butternuts, New York
  • Butthole Lane
  • Carefree, Arizona
  • Center of the World, Ohio
  • Cheesequake, New Jersey
  • Chicken, Alaska
  • Chugwater, Wyoming
  • Cookietown, Oklahoma
  • Correct, Indiana
  • Dick’s Knob
  • Ding Dong, Texas
  • Disappointment Islands
  • Earth, Texas
  • Eggnog, Utah
  • Fifty-Six, Arkansas
  • Funk, Nebraska
  • Greasy Corner, Arkansas
  • Hell, Michigan
  • Hot Coffee, Mississippi
  • Humptulips, Washington
  • Idiotville
  • Imalone, Wisconsin
  • Intercourse, Pennsylvania
  • Ketchuptown, South Carolina
  • Kickapoo, Kansas
  • Looneyville, Texas
  • Moose Factory
  • Mosquitoville, Vermont
  • Neutral, Kansas
  • New Erection
  • No Name, Colorado
  • Normal, Illinois
  • Nothing, Arizona
  • Peculiar, Missouri
  • Pee Pee, Ohio
  • Red Shirt, South Dakota
  • Sandwich, Massachusetts
  • Satan’s Kingdom
  • Scratch My Arse Rock
  • Slaughterville, Oklahoma
  • Stupid Lake
  • Sweet Lips, Tennessee
  • Worms, Nebraska
  • Zzyzx, California

Anyway, if you decide to take this approach, it’s a good idea to either use a form control with predefined options (select or radio) or integrate some sort of auto-complete (location API). This provides a better user experience and usually leads to more complete, reliable, and accurate data.
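Even with a select field, the value reaches the server as a plain string that anyone can edit, so it helps to check it against the same list of options. Here is a minimal sketch of that check. The SUPPORTED_COUNTRIES list and the parseCountry name are made up for illustration.

```javascript
// Hypothetical list of options, the same ones rendered in the <select>.
const SUPPORTED_COUNTRIES = new Map([
  ['US', 'United States'],
  ['CA', 'Canada'],
  ['BE', 'Belgium'],
]);

// Returns a normalized country code, or null if the value is not
// one of the predefined options.
function parseCountry(submitted) {
  if (typeof submitted !== 'string') return null;
  const code = submitted.trim().toUpperCase();
  return SUPPORTED_COUNTRIES.has(code) ? code : null;
}
```

This does not tell you whether the user is being honest. It only guarantees that whatever they report is a value your app knows how to handle.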

Getting User Location From the Device

Modern devices like smartphones and laptops have access to their location information through GPS, Wi-Fi data, cell towers, and IP addresses. As web developers, we don’t get direct access to this information, for security reasons, but there are some things we can do.

The first thing that comes to mind is the Geolocation API built into the browser. This provides a way for websites to request access to the user’s location with the getCurrentPosition method:

navigator.geolocation.getCurrentPosition(data => {
  console.log(data)
})

The function provides you with a GeolocationPosition object containing latitude, longitude, and other information:

{
  coords: {
    accuracy: 1153.4846436496573,
    altitude: null,
    altitudeAccuracy: null,
    heading: null,
    latitude: 28.4885376,
    longitude: 49.6407936,
    speed: null
  },
  timestamp: 1710198149557
}

Great! Just one problem:

Screenshot of the browser popup when a website asks to access your location.

The first time a website tries to use the Geolocation API, the user will be prompted with a request to share their information.

  • Best case: the user understands the extra step and accepts.
  • Mid case: the user gets annoyed and has a 50/50 chance of accepting or denying.
  • Worst case: the user is paranoid about government surveillance, assumes worst intentions, and never comes back to your app (this is me).

When using an API that requires user permission, it’s often a good idea to let the user know ahead of time to expect the popup, and only trigger it right when you need it. In other words, don’t request access as soon as your app loads. Wait until the user has focused on the location input field, for example.
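As a rough sketch of that idea, the helper below waits for the first focus event before calling getCurrentPosition, and it passes an error callback so a denied prompt does not fail silently. The requestLocationOnFocus name and the timeout values are my own assumptions. The geolocation parameter exists only so the function can be exercised outside a browser.

```javascript
// Ask for the position only when the user first focuses the given field.
function requestLocationOnFocus(
  input,
  onSuccess,
  onError,
  geolocation = globalThis.navigator?.geolocation
) {
  if (!geolocation) {
    // Old browser, insecure context, or a non-browser runtime.
    onError(new Error('Geolocation is not available'));
    return;
  }
  input.addEventListener(
    'focus',
    () => {
      geolocation.getCurrentPosition(onSuccess, onError, {
        enableHighAccuracy: false, // a coarse position is usually enough
        timeout: 10000, // give up after 10 seconds (arbitrary choice)
        maximumAge: 60000, // accept a cached position up to 1 minute old
      });
    },
    { once: true } // do not prompt again on every focus
  );
}
```

In a page, you would call it with something like document.querySelector('#location') as the input, where #location is a hypothetical field ID, and fall back to the plain text input inside the error callback.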

Getting User Location From Their IP Address

In case you’re not familiar, an IP address looks like this, 192.0.2.1. They are used to uniquely identify and locate devices in a network. This is how computers communicate over the internet, and each packet of data contains information about the IP address of the sender. Your home internet modem is a good example of a device in a network with an IP address.

The relevant thing to note is that you can get location information from an IP address. Blocks of addresses are allocated hierarchically, from regional registries to ISPs to their customers, so the leading part of an address identifies the network it belongs to. You can think of it as going from broader to finer scope: registry, to ISP, to region, to user. It doesn’t get fine enough to know someone’s specific address, but it’s often possible to get the city or zip code.

Here are two great resources if you want to know more about how this works:

For JavaScript developers like myself, you can access the remote IP in Node.js with response.socket.remoteAddress. And note that you are not getting the user’s IP, technically. You’re getting the IP address for the user’s connection (and anyone else on their connection), by way of their modem and ISP.

Internet user -> ISP -> IP address.
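Here is a small sketch of reading that address inside a Node.js request handler. It reads the same socket through the incoming request object. The getConnectionIp name is hypothetical, and the function deliberately ignores headers like X-Forwarded-For, which the client can set to anything unless a proxy you control overwrites them.

```javascript
// Returns the IP address of the connection, or null if it is unknown.
function getConnectionIp(request) {
  const address = request.socket.remoteAddress;
  // remoteAddress can be undefined if the socket was already destroyed.
  if (!address) return null;
  // On a dual-stack socket, Node reports IPv4 clients in IPv4-mapped
  // IPv6 form, e.g. "::ffff:192.0.2.1". Strip the 7-character prefix.
  return address.startsWith('::ffff:') ? address.slice(7) : address;
}
```

Inside an http.createServer callback, you would call getConnectionIp(request) with the first argument of the handler.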

An IP address alone is not enough to know where a user is coming from. You’ll need to look up the IP address subnets against a database of known subnet locations. It usually doesn’t make sense to maintain your own list. Instead, you can download an existing one, or ping a 3rd party service to look it up.
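To give a feel for what such a lookup does, here is a toy version for IPv4 only. The SUBNETS table is entirely made up: it uses address ranges reserved for documentation, and the countries attached to them are arbitrary. A real database has far more entries, covers IPv6, and uses a faster search than a linear scan.

```javascript
// Hypothetical table. These ranges are reserved for documentation and
// are not really assigned to any country.
const SUBNETS = [
  { cidr: '192.0.2.0/24', country: 'US' },
  { cidr: '198.51.100.0/24', country: 'BE' },
  { cidr: '203.0.113.0/24', country: 'JP' },
];

// Converts "192.0.2.1" to an unsigned 32-bit integer, or null if invalid.
function ipv4ToInt(ip) {
  const parts = String(ip).split('.');
  if (parts.length !== 4 || !parts.every((p) => /^\d{1,3}$/.test(p))) return null;
  const n = parts.map(Number);
  if (n.some((octet) => octet > 255)) return null;
  return ((n[0] << 24) | (n[1] << 16) | (n[2] << 8) | n[3]) >>> 0;
}

// True if the address falls inside the CIDR block, e.g. "192.0.2.0/24".
function inSubnet(ip, cidr) {
  const [base, bits] = cidr.split('/');
  const prefix = Number(bits);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  // Keep only the leading `prefix` bits. A shift by 32 is a no-op in
  // JavaScript, so the /0 case is handled separately.
  const mask = prefix === 0 ? 0 : (~0 << (32 - prefix)) >>> 0;
  return ((ipInt & mask) >>> 0) === ((baseInt & mask) >>> 0);
}

// Returns the country for the first matching subnet, or null.
function lookupCountry(ip) {
  const match = SUBNETS.find((entry) => inSubnet(ip, entry.cidr));
  return match ? match.country : null;
}
```

The null result matters as much as the matches. Plenty of addresses will not be in your table, so the rest of the app needs a sensible default.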

For basic needs, ip2location.com and KeyCDN offer free, limited options. For apps that rely heavily on determining geolocation from IP addresses or need a higher level of accuracy, you’ll want something more robust.

So now, we have a solution that requires no work from the user and has a pretty high level of accuracy. But “pretty high” is not a guarantee: the location tied to an IP address is not always the user’s actual location, as we will see.

Getting User Location From Edge Compute

I’ve written several articles about edge compute in the past, so I won’t go too deep, but edge compute is a way to run dynamic, server-side code against a user’s request from the nearest server. It works by routing all requests through a network of globally distributed servers, or nodes, and allowing the network to choose the nearest node to the user.

The great thing about edge compute is that the platforms provide you with user location information without the need to ask the user for permission or look up an IP address. It can provide this information because every node knows where it lives.

Akamai’s edge compute platform, EdgeWorkers, gives you access to a request object with a userLocation property. This property is a User Location Object that looks something like this:

{
  areaCodes: ["617"],
  bandwidth: "257",
  city: "CAMBRIDGE",
  continent: "NA", // North America
  country: "US",
  dma: "506",
  fips: ["25"],
  latitude: "42.364948",
  longitude: "-71.088783",
  networkType: "mobile",
  region: "MA",
  timezone: "GMT",
  zipCode: "02114+02134+02138-02142+02163+02238",
}

So now we have a reliable source of location information with little effort. The only issue is that it’s not technically the user’s location. The User Location Object actually represents the edge node that received the user’s request. This will be the closest node to the user, likely in the same area. This is a subtle distinction, but depending on your needs, it can make a big difference.

This is Why We Can’t Have Nice Things!

So, we’ve covered some options along with their benefits and caveats, but here’s the real kicker. None of the options we’ve looked at can be trusted.

Can’t trust the user

As mentioned above, we can’t trust users to always be honest and put in their actual location. And even if we could, they could make mistakes. And even if they don’t, some data can be ambiguous. For example, if I ask someone for their city, and they put “Portland,” how can I be certain they mean Portland, OR (the best Portland), and not one of the 18+ others (in the US, alone)?

Can’t trust the device

The first issue with things like the Geolocation API is that the user can just disallow using it. To which you may respond, “Fine, they can’t use my app then.” But this also fails to address another issue, which is the fact that the Geolocation API information can actually be overridden by the user in their browser settings. And it’s not even that hard.

Can’t trust the IP address

I’m not sure if it’s possible to spoof an IP address for the computer that is connecting to your website, but it’s pretty easy for a user to route their request through a proxy client. Commonly, this is referred to as a Virtual Private Network or VPN. The user connects to a VPN, their request goes to the VPN first, then the VPN connects to your website. As a result, the IP address you see is the VPN’s, not the user’s. This means any location data you get will be for the VPN, and not the user.

Can’t trust edge compute

Edge compute offers reliable information, but that information is the location of the edge node and not the actual user. Often, they can be close enough, but it’s possible that the user lives near the border of one region and their nearest edge node is on the other side of that border. What happens if you have distinct behavior based on those regional differences?

Also, edge compute is not free from the same VPN issues as IP addresses. With Akamai’s Enhanced Proxy Detection, you can identify if someone is using a VPN, but you still can’t access their original IP address.

What Can We Do About It?

So, there are a lot of ways to get location information, but none of them are entirely reliable. In fact, browser extensions can make it trivial for users to circumvent our efforts. Does that mean we should give up?

No!

I want to leave you better informed and prepared. So let’s look at some examples.

Content Translation

Say we have a website written in English, but also supports other languages. We’d like to improve the user experience by loading the local language of the user.

How should we treat users from Belgium, where they speak Dutch (Flemish), French, and German? Should we default to the most common language (Dutch)? Should we default to the website’s default language (English)?

For the first render of the page, I think it’s safe to either use the default language or the best guess, but the key thing is to let the user decide which is best for them (maybe they only speak French) and honor their decision on subsequent visits.

It could look like this:

  1. User requests the website.
  2. Request passes through edge compute to determine it’s coming from Belgium.
  3. Edge compute looks for the language preference from an HTTP cookie.
  4. If the cookie is present, use the preferred language.
  5. If the cookie is not present, use the English or Dutch version.
  6. On the website, provide the user with a list of predefined, supported languages (maybe using a <select> field).
  7. When the user selects a language preference, store the value in a cookie for future sessions.

In this scenario, we combine edge compute with user reporting to get location information to improve the experience. I don’t think it makes sense to use the Geolocation API at all. There is a risk of showing the wrong language, but the cost is low. The website works even if the location information is wrong or missing.
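The steps above could be sketched roughly like this. Everything named here is an assumption for illustration: the lang cookie name, the list of supported languages, and the country-to-language guesses. The country value would come from whatever your edge platform or IP lookup provides.

```javascript
const SUPPORTED_LANGUAGES = ['en', 'nl', 'fr', 'de'];
const DEFAULT_LANGUAGE = 'en';
// Hypothetical best guesses. Belgium maps to Dutch here, but that is a
// judgment call, which is why the cookie always wins.
const COUNTRY_LANGUAGE = new Map([
  ['BE', 'nl'],
  ['NL', 'nl'],
  ['FR', 'fr'],
  ['DE', 'de'],
]);

// Reads one value out of a raw Cookie header such as "a=1; lang=fr".
function readCookie(cookieHeader, name) {
  for (const pair of String(cookieHeader || '').split(';')) {
    const index = pair.indexOf('=');
    if (index === -1) continue;
    if (pair.slice(0, index).trim() === name) return pair.slice(index + 1).trim();
  }
  return null;
}

// Steps 3 to 5: prefer the stored choice, then the guess, then the default.
function chooseLanguage(cookieHeader, country) {
  const preferred = readCookie(cookieHeader, 'lang');
  if (preferred && SUPPORTED_LANGUAGES.includes(preferred)) return preferred;
  return COUNTRY_LANGUAGE.get(country) || DEFAULT_LANGUAGE;
}

// Step 7: the Set-Cookie header value that remembers the choice for a year.
function languageCookie(language) {
  if (!SUPPORTED_LANGUAGES.includes(language)) return null;
  return `lang=${language}; Path=/; Max-Age=31536000; SameSite=Lax`;
}
```

Note that a wrong or missing country never breaks anything. The worst outcome is the default language plus a language picker.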

Weather App

In this example, we have an application that shows the weather information based on location. In this case, the app requires the location information in order to work. How else can we show the weather?

In this scenario, it’s still safe to assume the user’s location on the first load. We can pull that information either from edge compute, or from the IP address, then show (what we think is) the user’s local weather. In addition to that, because the website’s main focus relies on location, we can use the Geolocation API to ask for more accurate data.

We’ll also want to offer a flexible user reporting option in case the user wants information for a different location. For that, we can provide a search input with auto-complete that fills in the location information with as much detail as possible. How you handle future visits may vary. You could always default to the “local” weather, or you could remember the location from the previous visit.

  1. User requests the website.
  2. On the first request, start the app assuming location information from edge compute or IP address.
  3. On the first client-side load, initiate the Geolocation API and update the information if necessary.
  4. You can store location information in a cookie for future loads.
  5. For other location searches, provide a flexible input that auto-completes location information and updates the app on submission.

The important thing to note here is that the app doesn’t actually care about where the user is located. We just care about having a location. User-reported location (search) takes precedence over a location found in a cookie, edge compute, or IP address.
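One way to express that precedence is a simple ordered list of sources. This is only a sketch. The source names are made up, and where I put the Geolocation API and the cookie relative to each other is my own assumption. The only ordering stated above is that the search comes first.

```javascript
// Highest priority first. Only "search first" comes from the text above;
// the rest of the order is an assumption you may want to change.
const LOCATION_PRIORITY = ['search', 'geolocation', 'cookie', 'edge', 'ip'];

// Takes an object like { search, geolocation, cookie, edge, ip } and
// returns the first available location along with where it came from.
function resolveLocation(candidates) {
  for (const source of LOCATION_PRIORITY) {
    const location = candidates[source];
    if (location) return { source, location };
  }
  return null; // nothing known: ask the user to search
}
```

Returning the source alongside the location makes it easy to show a hint like “showing weather for your approximate location” when the value is only a guess.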

Due to the daily change in weather, it’s also worth considering caching strategy and whether the app should be primarily server-rendered or client-rendered.

Store Locator

Imagine you run a brick-and-mortar business with multiple locations. You might show your product catalog and inventory online, but a good practice is to offer up-to-date information about the in-store inventory. For that, you would need to know which store to show inventory for, and for the best user experience, it should be the store closest to the user.

Once again, it makes sense to predict the user’s location using edge compute or IP address. Then, you also want to offer a flexible input that allows the user to put in their location information, but any auto-complete should be limited to the list of stores, sorted by proximity. It’s also good to initiate the Geolocation API.

The difference between this example and the last is that the main purpose of the site is not location-dependent. Therefore, you should wait until the user has interacted with the location-dependent feature. In other words, only ask the user for their location when they’ve focused on the store locator field.
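Sorting stores by proximity comes down to computing a distance from the user’s coordinates to each store. Here is a minimal sketch using the haversine formula, which treats the Earth as a sphere. That is fine for ranking stores but not survey-grade. The store objects and function names are hypothetical.

```javascript
const EARTH_RADIUS_KM = 6371; // mean radius, a common approximation

const toRadians = (degrees) => (degrees * Math.PI) / 180;

// Great-circle distance between two { latitude, longitude } points.
function distanceKm(a, b) {
  const dLat = toRadians(b.latitude - a.latitude);
  const dLon = toRadians(b.longitude - a.longitude);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.latitude)) *
      Math.cos(toRadians(b.latitude)) *
      Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Returns a new array of stores, nearest first, each with its distance.
function sortStoresByProximity(stores, user) {
  return stores
    .map((store) => ({ ...store, distanceKm: distanceKm(user, store) }))
    .sort((x, y) => x.distanceKm - y.distanceKm);
}
```

The user argument can come from any of the sources covered earlier. With a coarse source like an IP lookup, the nearest store may be off, which is one more reason to let the user pick a store themselves.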

Regional Pricing

This one’s a little tricky, but how would you handle charging different prices based on the user’s location? For example, some airlines and hotels have been reported to have higher prices for users booking from one region vs. another.

Ethics aside, this is a question about profits, which is highly impactful. So, you probably don’t want to allow users to easily change their prices through user-reported location information.

In this case, you’d probably only use edge compute or IP address. It’s possible for users to get around it with a VPN, but it’s probably the best you could do. If you’re really concerned about avoiding scammers, you could use Akamai’s Enhanced Proxy Detection and try blocking requests from VPN users, but that could lead to a no-sale instead of a discounted sale. Up to you.

This last example focuses more on the legal compliance side, so I’ll start with a small disclaimer: I AM NOT A LAWYER!!! This is a hypothetical example and should not be taken as legal advice.

In 2016, the European Union passed the General Data Protection Regulation (GDPR). It’s a law that protects the privacy of internet users in the EU, and it applies to companies that offer goods or services to individuals in the EU, even if the company is based elsewhere.

It has a lot of requirements for website owners, but the one I’ll focus on is the blight of cookie banners we now see everywhere online.

I’ll avoid discussing privacy issues, whether cookie banners are right or wrong, the effectiveness or ineffectiveness of them, or if there is a better approach. Instead, I’ll just say that you may want to only show cookie banners when you are legally required, and avoid them otherwise.

Once again, knowing the user’s location is pretty important. This is very similar to the previous case, and the implementation is similar too. The main difference is the severity of getting it wrong, and therefore the level of effort to get it right.
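Because the cost of a mistake is higher here, one reasonable approach is to fail toward the safe side: if the country is unknown, show the banner anyway. The sketch below assumes that policy. The country set is the 27 EU member states by ISO 3166-1 alpha-2 code and is only a starting point for illustration. Which jurisdictions actually belong in it is a question for your legal team, not for this snippet.

```javascript
// Illustrative list only: the 27 EU member states. Your real list may
// need more entries depending on the advice you get.
const BANNER_COUNTRIES = new Set([
  'AT', 'BE', 'BG', 'HR', 'CY', 'CZ', 'DK', 'EE', 'FI',
  'FR', 'DE', 'GR', 'HU', 'IE', 'IT', 'LV', 'LT', 'LU',
  'MT', 'NL', 'PL', 'PT', 'RO', 'SK', 'SI', 'ES', 'SE',
]);

// Decides whether to show the banner for a two-letter country code.
function shouldShowCookieBanner(country) {
  // Missing or malformed location: assume the stricter requirement.
  if (typeof country !== 'string' || country.length !== 2) return true;
  return BANNER_COUNTRIES.has(country.toUpperCase());
}
```

Compare this with the translation example, where a missing location quietly fell back to a default. Same inputs, but the fallback goes in the opposite direction because the stakes are different.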

Cookie banners might be the most ubiquitous example of how legislation and user location can impact a website, but if you’re looking for the most powerful, it’s probably The Great Firewall of China.

Closing

Alright, hopefully, this long and windy road has brought us all to the same place: The magical land of nuance.

We still didn’t touch on a couple of other challenges:

  • What happens when a user changes their location mid-session?
  • What happens if time zones are involved?
  • How do you report location information for disputed territories?

Still, I hope you found it useful in learning how user location is determined, what challenges it faces, and some ways you might approach various scenarios. Unfortunately, there is no one right way to approach location data. Some scenarios are better suited for user reporting, some are better for device heuristics, and some are better for edge compute or IP address. In most cases, it’s some sort of combination.

The important things you need to ask yourself are:

  • Do you need the user’s location or just any location?
  • How accurate does the data need to be?
  • Is it OK if the user location is falsified?

You also have legal compliance, regulations, and functionality to consider. Is 95% reliable OK?

If any of your location logic is for legal reasons, you’ll want to take steps to protect yourself. Account for data privacy laws like CCPA and GDPR. Include messaging in your terms of service to disallow bad behavior. These are some things to consider, but I’m no lawyer. Consult your legal team.

That’s all I have to say about that, but if you’re interested in cloud computing, new customers can sign up at linode.com/austingil for some free credits :)

Thank you so much for reading. If you liked this article, and want to support me, the best ways to do so are to share it, sign up for my newsletter, and follow me on Twitter.


Originally published on austingil.com.