30

I use nginx as a load balancer in front of several Tomcat instances. My incoming requests contain encoded query parameters, but by the time a request arrives at Tomcat, the parameters have been decoded:

incoming request to nginx:

curl -i "http://server/1.1/json/T;cID=1234;pID=1200;rF=http%3A%2F%2Fwww.google.com%2F"

incoming request to tomcat:

curl -i "http://server/1.1/json/T;cID=1234;pID=1200;rF=http:/www.google.com/"

I don't want my request parameters to be transformed, because in that case my tomcat throws a 405 error.
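
The difference between the two requests above is consistent with percent-decoding followed by merging of repeated slashes. Here is a rough Python sketch of that transformation. The function name `rough_normalize` is hypothetical, and this is only an approximation for illustration: it is an assumption that these two steps explain the observed result, and real nginx normalisation does more (for example, resolving `.` and `..` segments).

```python
import re
from urllib.parse import unquote


def rough_normalize(path: str) -> str:
    """Approximate what happens to the path: decode %XX, then merge slashes.

    This is an illustrative model only, not nginx's actual algorithm.
    """
    decoded = unquote(path)                # %3A -> ":", %2F -> "/"
    return re.sub(r"/{2,}", "/", decoded)  # "//" -> "/"


if __name__ == "__main__":
    raw = "/1.1/json/T;cID=1234;pID=1200;rF=http%3A%2F%2Fwww.google.com%2F"
    print(rough_normalize(raw))
```

Applied to the path of the first request, this yields the path of the second one, including the single slash in `http:/www.google.com/`.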

My nginx configuration is the following:

upstream tracking  {
    server front-01.server.com:8080;
    server front-02.server.com:8080;
    server front-03.server.com:8080;
    server front-04.server.com:8080;
}

server {
    listen 80;
    server_name tracking.server.com;
    access_log /var/log/nginx/tracking-access.log;
    error_log  /var/log/nginx/tracking-error.log;

    location / {
        proxy_pass  http://tracking/webapp;
    }
}

In my current Apache load balancer configuration, I use the AllowEncodedSlashes directive, which preserves my encoded parameters:

AllowEncodedSlashes NoDecode

I need to move from apache to nginx.

My question is the opposite of this question: Avoid nginx escaping query parameters on proxy_pass

Jean-Philippe Caruana
  • Shouldn't your query string start with a question mark? i.e. `?cID=1234;pID=1200;rF=http%3A%2F%2Fwww.google.com%2F` (see http://stackoverflow.com/questions/3481664/semicolon-as-url-query-separator) – vcarel Dec 10 '13 at 21:54
  • yes, and no Vianney. This is a "feature" of the framework RestEasy... But this will change in the future, believe me :) I think we should use a POST request instead – Jean-Philippe Caruana Dec 11 '13 at 08:52
  • W3C: "We recommend that HTTP server implementors, and in particular, CGI implementors support the use of ";" in place of "&" to save authors the trouble of escaping "&" characters in this manner." See http://stackoverflow.com/questions/3481664/semicolon-as-url-query-separator?lq=1 – Jean-Philippe Caruana Dec 12 '13 at 13:55
  • Yes, I read the same. Also http://en.wikipedia.org/wiki/URI_scheme. Besides, making the query string starting by a question mark (followed by semicolons) or directly by a semicolon seems to have a slightly different meaning. According to http://www.ietf.org/rfc/rfc3986.txt, semicolons are "often used" within path's segments, e.g. http://acme.com/foo;v=1/bar;v=2. – vcarel Dec 18 '13 at 12:02

5 Answers

53

I finally found the solution: I need to append the $request_uri variable to the proxy_pass URL:

location / {
    proxy_pass  http://tracking/webapp$request_uri;
}

That way, characters that were encoded in the original request will not be decoded, i.e. will be passed as-is to the proxied server.
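
One way to check what actually reaches the backend is to temporarily point nginx at a stub server that echoes the raw request path. A minimal Python sketch follows; the names `EchoRawPath`, `start_echo_server` and `fetch_raw_path` are hypothetical, and the stub only stands in for Tomcat during a test.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoRawPath(BaseHTTPRequestHandler):
    """Replies with the request path exactly as it appeared on the request line."""

    def do_GET(self):
        body = self.path.encode("ascii", "replace")  # self.path is not decoded
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_echo_server(port: int = 0) -> HTTPServer:
    """Start the stub on 127.0.0.1; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), EchoRawPath)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_raw_path(port: int, path: str) -> str:
    """Send GET <path> to the stub and return the echoed path."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read().decode("ascii")
    finally:
        conn.close()


if __name__ == "__main__":
    server = start_echo_server()
    port = server.server_address[1]
    print(fetch_raw_path(port, "/webapp/T;rF=http%3A%2F%2Fwww.google.com%2F"))
    server.shutdown()
```

To test a real setup, start the stub on the port an upstream entry points to and send the curl request from the question through nginx: if the echoed path still contains `%3A` and `%2F`, the encoding survived the proxy.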

Jean-Philippe Caruana
  • 9
    Wow, this indeed works and should be the default. I don't see why nginx would need to interfere with the encoding. – Ruben Verborgh Jul 29 '14 at 14:50
  • 1
    This does not work if your location is anything other than root. – Jazzepi May 17 '20 at 02:39
  • If you have done a URL rewrite before passing proxy, you should use `$uri` instead. You can read more on the difference between `$request_uri` and `$uri` – Wicfasho Jul 17 '21 at 23:05
  • 1
    It's frikkin crazy that NGINX is decoding URL parameters and then passing them invalidly into URLs. The same happens when you match `(.*)` in a path and then pass it to a URL rewrite (or proxy_pass) as in `/$1/`. If there are spaces encoded as `%20` in your original URL, NGINX will pass them on as ` ` which is invalid in a URL. – Marc Sep 03 '21 at 08:32
17

Jean's answer is good, but it does not work with sublocations. In that case, the more generic answer is:

location /path/ {
  if ($request_uri ~* "/path/(.*)") {
    proxy_pass http://tracking/webapp/$1;
  }
}
user1338062
  • Can anyone provide a link to any documentation containing this info? – Lrnk May 01 '19 at 15:25
  • @Lrnk This answer is using this. -> http://nginx.org/en/docs/http/ngx_http_rewrite_module.html#if to do a PCRE regex. The $1 is the first capture group. You'll see this is a kind of crummy way of telling nginx to drop the first part of the /path/. To be clear this if() will ALWAYS be successful because the if has essentially been validated by the location directive already. – Jazzepi May 17 '20 at 02:42
  • After a million iterations on this stupid problem this answer was the only one that worked for me when I needed to drop the incoming non-root path. – Jazzepi May 17 '20 at 19:15
  • This will work until you have spaces in your path. NGINX will decode `/foo%20bar/` and send your user to an invalid URL `/webapp/foo bar/`. – Marc Sep 03 '21 at 08:38
16

Note that URL decoding, referred to as $uri "normalisation" in the nginx documentation, happens before the request is passed to the backend if and only if:

  • either any URI is specified within proxy_pass itself, even if just the trailing slash all by itself,

  • or, URI is changed during the processing, e.g., through rewrite.


Both conditions are explicitly documented at http://nginx.org/r/proxy_pass (emphasis mine):

  • If the proxy_pass directive is specified with a URI, then when a request is passed to the server, the part of a normalized request URI matching the location is replaced by a URI specified in the directive

  • If proxy_pass is specified without a URI, the request URI is passed to the server in the same form as sent by a client when the original request is processed, or the full normalized request URI is passed when processing the changed URI
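
The two documented cases can be modelled with a small Python sketch. This is a simplified illustration, not nginx's implementation: the function names are hypothetical, normalisation is approximated as percent-decoding plus slash merging, and the sketch ignores both variables in proxy_pass and any re-escaping nginx may apply before sending the request.

```python
import re
from typing import Optional
from urllib.parse import unquote


def approx_normalized(path: str) -> str:
    """Crude stand-in for normalisation: decode %XX and merge slashes."""
    return re.sub(r"/{2,}", "/", unquote(path))


def forwarded_uri(raw_uri: str, location: str,
                  proxy_pass_uri: Optional[str] = None) -> str:
    """Model which URI the backend sees for a prefix location.

    proxy_pass_uri is the URI part of proxy_pass (e.g. "/app/"),
    or None when proxy_pass is written without any URI.
    """
    if proxy_pass_uri is None:
        # No URI in proxy_pass: the client's URI is passed through unchanged.
        return raw_uri
    # URI in proxy_pass: the location prefix of the *normalised* path is replaced.
    path, sep, query = raw_uri.partition("?")
    rest = approx_normalized(path)[len(location):]
    return proxy_pass_uri + rest + sep + query


if __name__ == "__main__":
    raw = "/api/save/http%3A%2F%2Fexample.com?test"
    print(forwarded_uri(raw, "/api/"))           # proxy_pass without URI
    print(forwarded_uri(raw, "/api/", "/app/"))  # proxy_pass with URI
```

In this model the first call keeps `%3A%2F%2F` intact, while the second loses the encoding, which mirrors the behaviour described in the two bullets above.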


The solution depends on whether or not you need to change the URL between the front-end and the backend.

  • If no URI change is required:

    # map `/foo` to `/foo`:
    location /foo {
        proxy_pass  http://localhost:8080;  # no URI -- not even just a slash
    }
    
  • Otherwise, if you do need to swap or map /api of the front-end with /app on the backend, then you can get the original URI from the $request_uri variable, and then use the rewrite directives over the $uri variable, similar to a DFA (BTW, if you want more rewrite DFA action, take a look at mdoc.su). Note that the return 400 part is needed in case someone tries to get around your second rewrite rule, as it wouldn't match something like //api/.

    # map `/api` to `/app`:
    location /foo {
        rewrite  ^  $request_uri;            # get original URI
        rewrite  ^/api(/.*)  /app$1  break;  # drop /api, put /app
        return 400;   # if the second rewrite won't match
        proxy_pass    http://localhost:8080$uri;
    }
    
  • If you simply want to add a prefix for the backend, then you can just use the $request_uri variable right away:

    # add `/webapp` to the backend:
    location / {
        proxy_pass    http://localhost:8080/webapp$request_uri;
    }
    

You might also want to take a look at a related answer, which shows some test-runs of the code similar to the above.

cnst
  • Out of all the answers to this I believe this is the best one. I'm not an nginx expert but I believe that the answer given by @user1338062 will cause nginx to parse the uri string twice. Once to hit the location block, and once for the if() statement that trivally returns true. I would think nginx would be smart enough to combine the rewrite rules into the location rules, so this rewrite approach would be better, but I don't know. – Jazzepi May 17 '20 at 02:45
  • In your middle example rewrite ^/api(/.*) /app$1 break; # drop /api, put /app return 400; # if the second rewrite won't match proxy_pass http://localhost:8080$uri; The capture group includes a / but so does the production rule of /$1 which I think results in the output proxy_pass url having double slashes. – Jazzepi May 17 '20 at 03:04
  • I couldn't get this to work. The capture group seemed to stop at the ? in a query with query params, and wouldn't pass them on to the upstream server. – Jazzepi May 17 '20 at 19:15
  • @Jazzepi For double-slash, I don't see any mentions of `/$1` in my answer, so, I think you have your own code doing the double-slash; note that this whole code was fully tested as per my related answer (see link in this very answer), and there's no double-slash there! Thanks for the upvote! For query params, I'm pretty sure everything is still supposed to be preserved as-is; I just tested `echo localhost:4300/{api,dec,mec,nod}/save/http%3A%2F%2Fexample.com\?test\&test | xargs -n1 curl` against the conf from https://stackoverflow.com/a/37584637/1122270, and don't see any issues nor duplicates. – cnst May 18 '20 at 00:07
  • how the second rewrite could ever match as the `location` block is for `/foo` ? – martin Feb 06 '21 at 09:48
  • This example will not work for URLs with spaces. As I mentioned above NGINX will decode `/foo%20bar/` to `/foo bar/` so you cannot use `$1` in all cases as it will be invalid (URLs cannot have spaces). – Marc Sep 03 '21 at 08:40
13

There is one documented option for the nginx proxy_pass directive:

If it is necessary to transmit URI in the unprocessed form then directive proxy_pass should be used without URI part:

location  /some/path/ {
  proxy_pass   http://127.0.0.1;
}

So in your case it could look like this. Do not worry about the request URI: it will be passed on to the upstream servers unchanged.

location / {
    proxy_pass  http://tracking;
}

Hope it helps.

holms
Casey
  • Thanks for your answer. I tried it but I really need to add a URI because several webapps are on the same server. – Jean-Philippe Caruana Dec 12 '13 at 13:09
  • 2
    Thanks! This worked for me. The latest jenkins (1.554.1) was complaining that my proxy configuration was wrong due to mishandling of encoded slashes. I changed my nginx configuration to "location /jenkins { proxy_pass http://localhost:8080; ..}" and the jenkins errors went away. – Gareth May 25 '14 at 19:59
  • 2
    FYI I had this issue because I had a `/` at the end of my backend: like `proxy_pass http://my-backend/`. Removing the final `/` solved the problem (after 4 hours of trial&error). – mettjus Oct 21 '16 at 17:01
0

In some cases, the problem is not on the nginx side: you must set the URI encoding (the `URIEncoding` attribute) on the Tomcat connector to UTF-8.

joshefin