On any Heroku stack, I want to get the client's IP. My first attempt might be:

request.headers['REMOTE_ADDR']

This does not work, of course, because all requests are passed through proxies. So the alternative was to use:

request.headers['X-Forwarded-For']

But this is not quite safe, is it?

If it contains only one value, I take it. If it contains several comma-separated values, I could take the first one.

But what if someone manipulates this value? I cannot trust request.headers['X-Forwarded-For'] as I could with request.headers['REMOTE_ADDR']. And there is no list of trusted proxies that I could use, either.

But there must be some way to reliably get the client's IP address, always. Do you know one?

Heroku's docs describe X-Forwarded-For as "the originating IP address of the client connecting to the Heroku router".

This sounds as if Heroku could be overwriting the X-Forwarded-For with the originating remote IP. This would prevent spoofing, right? Can someone verify this?

Jason FB
caw
  • I'm sorry, but what language is this? If it isn't Python, how do I do this in Python? – aravk33 Feb 13 '18 at 14:51
  • The [Heroku docs](https://devcenter.heroku.com/articles/http-routing#heroku-headers) (same ones you noted) explicitly say **not to trust** the `X-Forwarded-For` header for security reasons. There must have been an update since 2013. – staples Nov 05 '18 at 07:01
  • The original question asked this in the context of `ENV['REMOTE_ADDR']` vs. `ENV['HTTP_X_FORWARDED_FOR']`; I modified the question to specify that these are request headers (`request.headers`). – Jason FB Jan 01 '22 at 17:49

4 Answers

From Jacob, Heroku's Director of Security at the time:

The router doesn't overwrite X-Forwarded-For, but it does guarantee that the real origin will always be the last item in the list.

This means that, if you access a Heroku app in the normal way, you will just see your IP address in the X-Forwarded-For header:

$ curl http://httpbin.org/ip
{
  "origin": "123.124.125.126"
}

If you try to spoof the IP, your alleged origin is reflected but, critically, so is your real IP. That is all we need, so there is a clear and secure way to get the client's IP address on Heroku:

$ curl -H "X-Forwarded-For: 8.8.8.8" http://httpbin.org/ip
{
  "origin": "8.8.8.8, 123.124.125.126"
}

This is just the opposite of what is described on Wikipedia, by the way.

PHP implementation:

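function getIpAddress() {
    // Behind the Heroku router: take the last entry of X-Forwarded-For,
    // which (per the statement quoted above) is the IP the router saw.
    if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        $ipAddresses = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
        return trim(end($ipAddresses));
    }

    // No forwarding header present: fall back to the directly connecting address.
    return $_SERVER['REMOTE_ADDR'];
}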
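
Since a commenter on the question asked about Python, here is a rough equivalent of the PHP function above. Treat it as a sketch only: `get_client_ip` and its two parameters are hypothetical names, and it assumes your framework can hand you the request headers as a mapping and the socket peer address as a string. Like the PHP version, it relies on the claim that the router appends the connecting IP as the last item.

```python
from typing import Mapping


def get_client_ip(headers: Mapping[str, str], remote_addr: str) -> str:
    """Return the last X-Forwarded-For entry, else the socket peer address.

    Hypothetical helper mirroring the PHP function above.
    """
    # Header names are case-insensitive, so normalise the keys before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    forwarded = normalised.get("x-forwarded-for", "")
    # Split the comma-separated list and drop empty fragments.
    entries = [part.strip() for part in forwarded.split(",") if part.strip()]
    if entries:
        return entries[-1]
    return remote_addr


print(get_client_ip({"X-Forwarded-For": "8.8.8.8, 123.124.125.126"}, "10.0.0.1"))
# prints 123.124.125.126
```

With no X-Forwarded-For header at all, the function simply returns whatever peer address you passed in.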
caw
  • https://devcenter.heroku.com/articles/http-routing doesn't go into as much detail as your post, but may be interesting to people wanting to check if the behaviour has changed. The nice thing here is the Rails RemoteIp middleware should work correctly, returning the last trustable IP, which will be the client or whichever untrusted proxy they went through, without needing to configure any Heroku proxy addresses. – nruth Jan 05 '16 at 20:37
  • I don't think the example you give is the opposite of what Wikipedia is saying (now at least). The client connects to Heroku with 8.8.8.8 in the header, essentially faking that it's a proxy forwarding a request from 8.8.8.8, then Heroku appends the IP of the connection it receives to the list. Or in their list example, a normal request will be [client], and a faked one will be [fakeclient, client]. There are no proxy IPs inserted because Heroku only uses 1 proxy. – nruth Jan 05 '16 at 20:41
  • @nruth What I meant is that Wikipedia says "the left-most being the original client" while Heroku says "the real origin will always be the last item in the list". – caw Jan 08 '16 at 01:23
  • Both are correct. The 'real origin' _should_ be the leftmost IP, but it can be spoofed. Only the _rightmost_ IP is guaranteed by Heroku, because that is the IP that connected to Heroku, but that may be a proxy. Please see Joel Watson's detailed answer for more information. – wuputah May 05 '16 at 22:49
  • @wuputah I think you are spot on. I am testing this and what I am seeing is that the leftmost address is my own IP, whereas Heroku added a second address... but that's not my IP... So as to the original question it seems you actually need **the first** element from the array (e.g. for GeoIP purposes), not the last one. – Stijn de Witt Nov 22 '16 at 10:16
  • There's a typo in this answer: "X-Forwareded-For" instead of "X-Forwarded-For". Tripped me up. – David Hariri May 10 '17 at 14:13
  • @DavidHariri Thanks, fixed! – caw May 10 '17 at 18:54
  • For anyone who is working with express in node, the key for the right header is all lowercase: `req.headers['x-forwarded-for']` – Felipe Jan 25 '18 at 05:22
  • I believe this isn't correct anymore, and the answer below is. If the request comes through a proxy, the proxy IP will be last in the list. – Elias Dorneles Jan 21 '19 at 14:28
  • Since this was mentioned in the comments, the Rails `remote_ip` method does not appear to work with Heroku. It appears to get a proxy address. – B Seven Dec 01 '19 at 23:34

I work in Heroku's support department and have spent some time discussing this with our routing engineers. I wanted to post some additional information to clarify some things about what's going on here.

The example in the answer above showed the client IP last only by coincidence; that position is not guaranteed. It wasn't first because the originating request claimed to be forwarding for the IP specified in its X-Forwarded-For header. When the Heroku router received the request, it simply appended the directly connecting IP to the X-Forwarded-For list, after the one that had been injected into the request.

Our router always adds the IP that connected to the AWS ELB in front of our platform as the last IP in the list. This IP could be the original one (and when there's only one IP, it almost certainly is), but the instant multiple IPs are chained, all bets are off. The convention is always to add the latest IP in the chain to the end of the list (which is what we do), but the chain can be altered at any point along the way and different IPs could be inserted. As such, the only IP that's reliable (from the perspective of our platform) is the last IP in the list.

To illustrate, let's say someone initiates a request and arbitrarily adds 3 additional IPs to the X-Forwarded-For header:

curl -H "X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4" http://www.google.com

Imagine this machine's IP was 9.9.9.9 and that it had to pass through a proxy (e.g., a university's campus-wide proxy). Let's say that proxy had an IP of 2.2.2.2. Assuming it wasn't configured to strip X-Forwarded-For headers (which it likely wouldn't be), it would just tack the 9.9.9.9 IP to the end of the list and pass the request on to Google. At this point, the header would look like this:

X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9

That request will then pass through Google's endpoint, which will append the university proxy's IP of 2.2.2.2, so the header will finally look like this in Google's logs:

X-Forwarded-For: 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9,2.2.2.2

So, which is the client IP? It's impossible to say from Google's standpoint. In reality, the client IP is 9.9.9.9. The last IP listed is 2.2.2.2 though and the first is 12.12.12.12. All Google would know is that the 2.2.2.2 IP is definitely correct because that was the IP that actually connected to their service – but they wouldn't know if that was the initial client for the request or not from the data available. In the same way, when there's just one IP in this header – that is the IP that directly connected to our service, so we know it's reliable.
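
To make the walk-through above concrete, here is a small Python sketch that replays it. The helper `append_peer` is a hypothetical name; it only models the convention described in this answer (each hop appends the address that connected to it) and is not code from any real proxy.

```python
from typing import Optional


def append_peer(xff: Optional[str], peer_ip: str) -> str:
    """Model one hop: append the address that connected to this hop."""
    if not xff:
        return peer_ip
    return f"{xff},{peer_ip}"


# The client at 9.9.9.9 sends a forged header.
forged = "12.12.12.12,15.15.15.15,4.4.4.4"
# The university proxy sees a connection from 9.9.9.9 and appends it.
after_proxy = append_peer(forged, "9.9.9.9")
# The final endpoint sees a connection from the proxy at 2.2.2.2.
at_endpoint = append_peer(after_proxy, "2.2.2.2")

print(at_endpoint)
# prints 12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9,2.2.2.2
```

Nothing in the resulting string distinguishes the three forged entries from the genuine 9.9.9.9; only the final 2.2.2.2 was observed by the endpoint itself.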

From a practical standpoint, this IP will likely be reliable most of the time (because most people won't be bothering to spoof their IP). Unfortunately, it's impossible to prevent this sort of spoofing and by the time a request gets to the Heroku router, it's impossible for us to tell if IPs in an X-Forwarded-For chain have been tampered with or not.

All reliability issues aside, these IP chains should always be read from left-to-right. The client IP should always be the left-most IP.
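
One way to keep the two ends of the chain apart in code is to return both and let the caller decide. This is a sketch in Python; `ForwardedChain` and `read_chain` are hypothetical names, not part of any framework.

```python
from typing import NamedTuple, Optional


class ForwardedChain(NamedTuple):
    claimed_client: Optional[str]   # left-most entry; may have been forged
    connecting_peer: Optional[str]  # right-most entry; added by the last hop


def read_chain(xff: str) -> ForwardedChain:
    """Split an X-Forwarded-For value into its left-most and right-most entries."""
    entries = [part.strip() for part in xff.split(",") if part.strip()]
    if not entries:
        return ForwardedChain(None, None)
    return ForwardedChain(entries[0], entries[-1])


chain = read_chain("12.12.12.12,15.15.15.15,4.4.4.4,9.9.9.9,2.2.2.2")
print(chain.claimed_client)   # prints 12.12.12.12
print(chain.connecting_peer)  # prints 2.2.2.2
```

When the header holds a single IP, both fields are the same value, which matches the case described above as almost certainly reliable.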

Joel Watson
  • I was following up until the last paragraph. Should that say that IP chains should be read from right-to-left and that the client IP should be the right-most IP? – Aaron Apr 05 '17 at 13:21
  • The client IP is the left-most IP. Convention is that additional IPs are appended to the list as they're encountered, so the very first IP in the list should be the actual client IP you want in most cases. Other IPs in the list are intermediates. Just keep in mind that the IPs in the list can be arbitrarily modified at any point in the request chain, so you're not _guaranteed_ that the IP is correct. – Joel Watson Jun 20 '17 at 18:37
  • So in the above example would `12.12.12.12` be considered the client's IP even though it is `9.9.9.9` because they intentionally added the additional IPs to the header? – Aaron Jun 21 '17 at 11:02
  • I can verify after testing that the client IP is the FIRST not the LAST - glad I tested before deploying =p – Roi Jul 27 '17 at 22:59

You can never really trust any information coming from the client. It's more a question of whom you trust and how you verify it. Even Heroku could be made to provide a bad HTTP_X_FORWARDED_FOR value if they have a bug in their code or get hacked somehow. Another possibility is some other Heroku machine connecting to your server internally, bypassing their proxy altogether while faking REMOTE_ADDR and/or HTTP_X_FORWARDED_FOR.

The best answer here would depend on what you're trying to do. If you're trying to verify your clients, a client-side certificate might be a more appropriate solution. If all you need the IP for is geo-location, trusting the input might be good enough. Worst case, someone will fake the location and get the wrong content... If you have a different use case, there are many other solutions in between those two extremes.
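
If you do go the "trust the input" route, it may still be worth checking that the value is at least a well-formed IP before handing it to a geo-location lookup. A minimal Python sketch using the standard `ipaddress` module follows; `parse_ip` is a hypothetical helper name, and the check only validates syntax, not whether the address is genuine.

```python
import ipaddress
from typing import Optional


def parse_ip(candidate: str) -> Optional[str]:
    """Return the canonical form of a syntactically valid IP, else None."""
    try:
        return str(ipaddress.ip_address(candidate.strip()))
    except ValueError:
        # Not a valid IPv4 or IPv6 literal.
        return None


print(parse_ip(" 8.8.8.8 "))   # prints 8.8.8.8
print(parse_ip("not-an-ip"))   # prints None
```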

kichik
  • Thank you! I'm just asking because Heroku _knows_ the client's real IP. This is just the IP that Heroku's proxy received the request from. If _that_ is a proxy again, we don't have to care. This is the normal situation you would also have with `REMOTE_ADDR` when the client is behind a proxy. Heroku being hacked or having a bug in their code is an exception and we should not care, either. Because we cannot do anything about it. But if Heroku did just _overwrite_ the `HTTP_X_FORWARDED_FOR` we would always know the client's real IP, which we would get with `REMOTE_ADDR` normally. Right? – caw Aug 16 '13 at 01:50
  • Yes, that is correct. A bit more on that is available on [Wikipedia](http://en.wikipedia.org/wiki/X-Forwarded-For). – kichik Aug 16 '13 at 01:58

If I make a request with multiple X-Forwarded-For headers:

curl -s -v -H "X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3" -H "X-Forwarded-For: 2.2.2.2" -H "X-Forwarded-For: 3.3.3.3" https://foo.herokuapp.com/

> X-Forwarded-For: 1.1.1.1, 1.1.1.2, 1.1.1.3
> X-Forwarded-For: 2.2.2.2
> X-Forwarded-For: 3.3.3.3

The X-Forwarded-For header passed along to the app will be:

1.1.1.1, 1.1.1.2, 1.1.1.3, <real client IP>, 2.2.2.2, 3.3.3.3

So picking the last entry from that list does not hold up :/
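
The observation above can be replayed in a few lines of Python. This is only a model of the behaviour I saw, not something taken from Heroku's documentation; `merge_as_observed` is a hypothetical name and `203.0.113.7` is a placeholder standing in for the real client IP.

```python
from typing import List


def merge_as_observed(header_values: List[str], peer_ip: str) -> str:
    """Model the reported merge: the connecting IP lands after the first
    header's values, and the remaining headers' values follow it."""
    if not header_values:
        return peer_ip
    first, rest = header_values[0], header_values[1:]
    return ", ".join([first, peer_ip, *rest])


merged = merge_as_observed(
    ["1.1.1.1, 1.1.1.2, 1.1.1.3", "2.2.2.2", "3.3.3.3"],
    "203.0.113.7",  # placeholder for the real client IP
)
entries = [part.strip() for part in merged.split(",")]

print(merged)
# prints 1.1.1.1, 1.1.1.2, 1.1.1.3, 203.0.113.7, 2.2.2.2, 3.3.3.3
print(entries[-1] == "203.0.113.7")
# prints False
```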

dentarg