So, I've seen Tips for debugging .htaccess rewrite rules - but I have a hard time making use of the suggestions there.

Basically, what I want is to see the input and output of every RewriteCond and RewriteRule entry, as well as the order in which they run, but I cannot find anything that helps me see that.

So here is a brief example, from what I've read so far: I'm running Apache2 locally on my Windows 10 machine; and I have the following in httpd.conf:

<Location /subfold/dl>

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/subfold/dl/
RewriteRule ^subfold/dl/(.*)/(.*)$ httpdocs__$1__$2 [ENV=RSLT:AAA,NE]
SetEnvIf Request_URI "^/subfold/dl/" REWRGO=Yes
# "without 'always' your additions will only go out on successful responses"
Header always set X-REWRGO "%{REWRGO}e"
Header always set X-RSLT "%{RSLT}e"

</Location>

So, here I basically create a wrong rewrite rule, which means the server will always respond with 404.

Because of that, I have to use the always keyword with Header - otherwise response headers never get sent (see Apache: difference between "Header always set" and "Header set"?).

Eventually, what I want is to capture the output of the rewrite rule in an environment variable (e.g. with [ENV=RSLT:httpdocs__$1__$2,NE]), and then write that variable into a response header. Unfortunately, that does not quite work - when I issue a request with curl:

$ curl -IkL http://127.0.0.1/subfold/dl/my/test.html
HTTP/1.1 404 Not Found
Date: Mon, 18 Oct 2021 10:29:12 GMT
Server: Apache/2.4.46 (Win32) OpenSSL/1.1.1j
X-REWRGO: Yes
X-RSLT: (null)
Content-Type: text/html; charset=iso-8859-1

... we can see that the value of the REWRGO variable did end up in the header - but RSLT shows up as (null), even though it was explicitly set to the value AAA; furthermore, my log settings:

LogLevel notice setenvif:trace8 rewrite:trace8

... print this in Apache2's error.log:

[Mon Oct 18 12:29:12.151566 2021] [rewrite:trace2] [pid 11368:tid 1932] mod_rewrite.c(483): [client 127.0.0.1:9848] 127.0.0.1 - - [127.0.0.1/sid#14c6370][rid#7031e10/initial] init rewrite engine with requested uri /subfold/dl/my/test.html

[Mon Oct 18 12:29:12.152573 2021] [rewrite:trace1] [pid 11368:tid 1932] mod_rewrite.c(483): [client 127.0.0.1:9848] 127.0.0.1 - - [127.0.0.1/sid#14c6370][rid#7031e10/initial] pass through /subfold/dl/my/test.html

[Mon Oct 18 12:29:12.152573 2021] [setenvif:trace2] [pid 11368:tid 1932] mod_setenvif.c(630): [client 127.0.0.1:9848] Setting REWRGO

[Mon Oct 18 12:29:12.152573 2021] [rewrite:trace3] [pid 11368:tid 1932] mod_rewrite.c(483): [client 127.0.0.1:9848] 127.0.0.1 - - [127.0.0.1/sid#14c6370][rid#7031e10/initial] [perdir /subfold/dl/] add path info postfix: C:/bin/Apache24/htdocs/subfold -> C:/bin/Apache24/htdocs/subfold/dl/my/test.html

[Mon Oct 18 12:29:12.152573 2021] [rewrite:trace3] [pid 11368:tid 1932] mod_rewrite.c(483): [client 127.0.0.1:9848] 127.0.0.1 - - [127.0.0.1/sid#14c6370][rid#7031e10/initial] [perdir /subfold/dl/] applying pattern '^subfold/dl/(.*)/(.*)$' to uri 'C:/bin/Apache24/htdocs/subfold/dl/my/test.html'

[Mon Oct 18 12:29:12.153551 2021] [rewrite:trace1] [pid 11368:tid 1932] mod_rewrite.c(483): [client 127.0.0.1:9848] 127.0.0.1 - - [127.0.0.1/sid#14c6370][rid#7031e10/initial] [perdir /subfold/dl/] pass through C:/bin/Apache24/htdocs/subfold

So, I have everything in this log, except what I need - and that is: what was the final URL string that came out of the RewriteRule (wrong or not)?

But, regardless - the error log shows me, at least, that the RewriteRule has been triggered, and that would make me expect that the RSLT environment variable should be set too; but it is not - as we can see in the header, it is null. So I cannot really use this technique for debugging, either.

(I guess this environment variable problem is related to what is noted in RewriteCond with SetEnv, which quotes Setting Environment Variables:

The SetEnv directive runs late during request processing meaning that directives such as SetEnvIf and RewriteCond will not see the variables set with it.

But then, the same post says setting an environment variable within the RewriteRule should work - and yet, in my example, it does not?)

Can anyone explain how I can correctly capture the output of a RewriteRule, and then print it somehow - for instance, as a header value - for each request URL that I might send through, say, curl?

1 Answer

Ok, managed to get a bit further with this.

First, let's consider this httpd.conf snippet:

<Location /subfold/dl>

RewriteEngine On
RewriteOptions AllowAnyURI
RewriteCond %{REQUEST_URI} ^/subfold/dl/
#RewriteRule subfold/dl/(.*)/(.*)$ httpdocs__$1__$2
RewriteRule (.*) - [E=RSLT:$1,CO=;frontdoor;yes_"$1";127.0.0.1;1440;/,NE,PT,L]
SetEnvIf Request_URI "^/subfold/dl/" REWRGO=Yes
# "without 'always' your additions will only go out on successful responses"
SetEnvIf Cookie "frontdoor=([^;]+)" MyFrontDoor=%1
Header always set X-REWRGO "%{REWRGO}e"
Header always set X-RSLT "%{RSLT}e"
Header always set X-FRONTDOOR "%{MyFrontDoor}e"
Header always set X-Request_URI1 "%{REQUEST_URI}e"
Header always set X-Request_URI2 "expr=%{REQUEST_URI}"

</Location>

When I have this in my httpd.conf, and I start Apache httpd.exe, and I issue a request, this is what I get now:

$ curl -IkL http://127.0.0.1/subfold/dl/my/test.html
HTTP/1.1 404 Not Found
Date: Mon, 18 Oct 2021 12:35:36 GMT
Server: Apache/2.4.46 (Win32) OpenSSL/1.1.1j
X-REWRGO: Yes
X-RSLT: C:/bin/Apache24/htdocs/subfold/dl/my/test.html
X-FRONTDOOR: (null)
X-Request_URI1: (null)
X-Request_URI2: /subfold/dl/my/test.html
Set-Cookie: frontdoor=yes_"C:/bin/Apache24/htdocs/subfold/dl/my/test.html"; path=/; domain=127.0.0.1; expires=Tue, 19-Oct-2021 12:35:36 GMT
Content-Type: text/html; charset=iso-8859-1

So, something started working ... and what I gather from this:

First, the AllowAnyURI is not really needed here - I thought it was because https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html says:

When RewriteRule is used in VirtualHost or server context with version 2.2.22 or later of httpd, mod_rewrite will only process the rewrite rules if the request URI is a URL-path.

I was afraid that my request URI might not be recognized, so I enabled AllowAnyURI - but that does not seem to be the issue here.

So here is what this tests: with RewriteRule (.*) - we basically have a "passthrough" - the original request does not get rewritten; again https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html :

The Substitution may be a: ...
- (dash)
A dash indicates that no substitution should be performed (the existing path is passed through untouched). This is used when a flag (see below) needs to be applied without changing the path.

... and we also have the corresponding [PT] flag there. Then, since the entire match request is in parentheses (.*) and therefore captured, we can output it with $1 in the same line; and finally, we both set an environment variable RSLT to this, as well as a cookie.

Now here is where I got surprised: what ends up in the "Pattern" part of the rewrite rule is not the original URL - it is what Apache would see as the translation of this URL to the local filesystem!!!

I first noticed this when I got split values for the cookie, as the CO flag always splits on colons (:) - even when the colon is part of string data in a variable (here the value contained C:/.., which is what ends up being split)! Hence I tried to use double quotes, but that did not work - what worked eventually was https://httpd.apache.org/docs/2.4/rewrite/flags.html#flag_co :

If a literal ':' character is needed in any of the cookie fields, an alternate syntax is available. To opt-in to the alternate syntax, the cookie "Name" should be preceded with a ';' character, and field separators should be specified as ';'.
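To make the difference concrete, here is a minimal sketch of the two CO flag syntaxes side by side. The cookie names demo_colon and demo_semi and the value yes are made up for illustration; the field order (name, value, domain, lifetime in minutes, path) follows the flags documentation linked above, and I have not tested this exact fragment.

```apache
RewriteEngine On

# Default syntax: fields are separated by ':' - fine as long as
# no field contains a literal colon.
RewriteRule (.*) - [CO=demo_colon:yes:127.0.0.1:1440:/]

# Alternate syntax: a ';' before the cookie name switches the field
# separator to ';', so a value like C:/... is not split at its colon.
RewriteRule (.*) - [CO=;demo_semi;yes_$1;127.0.0.1;1440;/]
```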

In any case, here we've had:

  • http://127.0.0.1 /subfold/dl/my/test.html - original request
  • C:/bin/Apache24/htdocs /subfold/dl/my/test.html - what Apache mapped that request to, in the local filesystem

And essentially, this was the problem with the code in the OP: RewriteRule ^subfold/dl/(.*)/(.*)$ ... would mean: match a string that starts with subfold... - however, the string C:/bin/... definitely does not start like that, and so the RewriteRule never matches, and does not execute!

I guess, another confusion for me was, that (.*) as the Pattern (first arg) in RewriteRule, typically looks like the REQUEST_URI - but that is within .htaccess in a directory; https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html again:

  • In VirtualHost context, The Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string (e.g. "/app1/index.html"). This is the (%-decoded) URL-path.

  • In per-directory context (Directory and .htaccess), the Pattern is matched against only a partial path, for example a request of "/app1/index.html" may result in comparison against "app1/index.html" or "index.html" depending on where the RewriteRule is defined.

... where

REQUEST_URI The path component of the requested URI, such as "/index.html". This notably excludes the query string which is available as its own variable named QUERY_STRING

So, my mistake was assuming that (.*) as the Pattern in the RewriteRule would be matched against the path component (REQUEST_URI-like), and writing my pattern against that; whereas, in this case, it was actually matched against the full filesystem path, starting with C:/...!
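For comparison, this is roughly how I understand the same rule would look in server or VirtualHost context, where (per the documentation quoted above) the Pattern is matched against the URL-path including its leading slash. This is an untested sketch: the VirtualHost address is a placeholder, and the rewrite target /httpdocs__$1__$2 is just as bogus as in the original example. Note that the mod_rewrite documentation also says rewrite rules inside <Location> sections are syntactically permitted but unsupported, which may be part of why the behaviour here is surprising.

```apache
<VirtualHost *:80>
    RewriteEngine On

    # In this context the Pattern sees "/subfold/dl/my/test.html",
    # so the ^ anchor plus leading slash is appropriate.
    RewriteRule ^/subfold/dl/(.*)/(.*)$ /httpdocs__$1__$2 [E=RSLT:/httpdocs__$1__$2,NE]

    # 'always' so the header is also sent on error responses such as 404.
    Header always set X-RSLT "%{RSLT}e"
</VirtualHost>
```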

That being said, we can now uncomment the line RewriteRule subfold/dl/(.*)/(.*)$ httpdocs__$1__$2 in the above snippet - noting that this time, the pattern does not have the starting caret (^), which would mean "match the start of string":

<Location /subfold/dl>

RewriteEngine On
RewriteCond %{REQUEST_URI} ^/subfold/dl/
RewriteRule subfold/dl/(.*)/(.*)$ httpdocs__$1__$2
RewriteRule (.*) - [E=RSLT:$1,CO=;frontdoor;yes_"$1";127.0.0.1;1440;/,NE,PT,L]
SetEnvIf Request_URI "^/subfold/dl/" REWRGO=Yes
# "without 'always' your additions will only go out on successful responses"
SetEnvIf Cookie "frontdoor=([^;]+)" MyFrontDoor=%1
Header always set X-REWRGO "%{REWRGO}e"
Header always set X-RSLT "%{RSLT}e"
Header always set X-FRONTDOOR "%{MyFrontDoor}e"
Header always set X-Request_URI1 "%{REQUEST_URI}e"
Header always set X-Request_URI2 "expr=%{REQUEST_URI}"

</Location>

... and the response to this is:

$ curl -IkL http://127.0.0.1/subfold/dl/my/test.html
HTTP/1.1 404 Not Found
Date: Mon, 18 Oct 2021 13:18:33 GMT
Server: Apache/2.4.46 (Win32) OpenSSL/1.1.1j
X-REWRGO: Yes
X-RSLT: C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html
X-FRONTDOOR: (null)
X-Request_URI1: (null)
X-Request_URI2: /subfold/dl/httpdocs__my__test.html
Set-Cookie: frontdoor=yes_"httpdocs__my__test.html/dl/my/test.html"; path=/; domain=127.0.0.1; expires=Tue, 19-Oct-2021 13:18:33 GMT
Content-Type: text/html; charset=iso-8859-1

So, yeah - finally I can see the expected mod_rewrite string substitution/replacement in headers, even if the rewritten URI is incorrect (i.e. does not correctly map to existing local filesystem data) -- which is what I wanted to achieve.

For fun, this is what comes out of the Apache error (that is, mod_rewrite) log:

[Mon Oct 18 15:21:55.007369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] init rewrite engine with requested uri /subfold/dl/my/test.html
[Mon Oct 18 15:21:55.007369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] pass through /subfold/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [setenvif:trace2] [pid 15508:tid 1936] mod_setenvif.c(630): [client 127.0.0.1:12206] Setting REWRGO
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] add path info postfix: C:/bin/Apache24/htdocs/subfold -> C:/bin/Apache24/htdocs/subfold/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] applying pattern 'subfold/dl/(.*)/(.*)$' to uri 'C:/bin/Apache24/htdocs/subfold/dl/my/test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace4] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] RewriteCond: input='/subfold/dl/my/test.html' pattern='^/subfold/dl/' => matched
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] rewrite 'C:/bin/Apache24/htdocs/subfold/dl/my/test.html' -> 'httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] add per-dir prefix: httpdocs__my__test.html -> /subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] add path info postfix: /subfold/dl/httpdocs__my__test.html -> /subfold/dl/httpdocs__my__test.html/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] strip per-dir prefix: /subfold/dl/httpdocs__my__test.html/dl/my/test.html -> httpdocs__my__test.html/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] applying pattern '(.*)' to uri 'httpdocs__my__test.html/dl/my/test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] setting env variable 'RSLT' to 'httpdocs__my__test.html/dl/my/test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] setting cookie 'frontdoor=yes_"httpdocs__my__test.html/dl/my/test.html"; path=/; domain=127.0.0.1; expires=Tue, 19-Oct-2021 13:21:55 GMT'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] forcing '/subfold/dl/httpdocs__my__test.html' to get passed through to next API URI-to-filename handler
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] trying to replace context docroot C:/bin/Apache24/htdocs with context prefix
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] internal redirect with /subfold/dl/httpdocs__my__test.html [INTERNAL REDIRECT]
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] init rewrite engine with requested uri /subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] pass through /subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [setenvif:trace2] [pid 15508:tid 1936] mod_setenvif.c(630): [client 127.0.0.1:12206] Setting REWRGO
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] add path info postfix: C:/bin/Apache24/htdocs/subfold -> C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] applying pattern 'subfold/dl/(.*)/(.*)$' to uri 'C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] add path info postfix: C:/bin/Apache24/htdocs/subfold -> C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] applying pattern '(.*)' to uri 'C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] setting env variable 'RSLT' to 'C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] skipping already set cookie 'frontdoor'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] forcing 'C:/bin/Apache24/htdocs/subfold' to get passed through to next API URI-to-filename handler
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] initial URL equal rewritten URL: C:/bin/Apache24/htdocs/subfold [IGNORING REWRITE]

Finally, note:

  • Now that our RewriteRule does not insist on ^start of string, the cookie value does not contain the starting part of the absolute path as before
  • However, we still have the absolute local (C:/...) path for RSLT - and that persists, even if we add [P] (for proxy) or [R=302] (for redirect) flags to the RewriteRule (there will be no change in behavior, since the page returns 404 regardless)
  • SetEnvIf Cookie doesn't match the cookie, even though the cookie gets written out - most likely because SetEnvIf Cookie looks at the Cookie header of the incoming request, which curl did not send here, while Set-Cookie is only part of the response
  • Even if %{REQUEST_URI} definitely exists, it does not get output in the header via the "%{REQUEST_URI}e" syntax - it only gets output via the "expr=%{REQUEST_URI}" syntax, which I guess is an ap_expr
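
A sketch related to the RSLT and REQUEST_URI points, with the caveat that I have not verified it on this setup. As far as I know, Apache renames the environment variables of the original request with a REDIRECT_ prefix when it performs an internal redirect (which the log above shows happening), and the ap_expr reqenv function is another way to read an environment variable from a Header directive. The header names X-RSLT-ORIG and X-RSLT-EXPR are made up for illustration.

```apache
# Value set before the internal redirect, assuming Apache kept it
# under the REDIRECT_ prefix; prints (null) if it did not.
Header always set X-RSLT-ORIG "%{REDIRECT_RSLT}e"

# The "...e" format only reads environment variables, and REQUEST_URI is
# presumably not one here; an expr= value can read request variables
# and, through reqenv, environment variables as well.
Header always set X-Request-URI "expr=%{REQUEST_URI}"
Header always set X-RSLT-EXPR "expr=%{reqenv:RSLT}"
```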