Ok, managed to get a bit further with this.
First, let's consider this httpd.conf snippet:
<Location /subfold/dl>
RewriteEngine On
RewriteOptions AllowAnyURI
RewriteCond %{REQUEST_URI} ^/subfold/dl/
#RewriteRule subfold/dl/(.*)/(.*)$ httpdocs__$1__$2
RewriteRule (.*) - [E=RSLT:$1,CO=;frontdoor;yes_"$1";127.0.0.1;1440;/,NE,PT,L]
SetEnvIf Request_URI "^/subfold/dl/" REWRGO=Yes
# "without 'always' your additions will only go out on successful responses"
SetEnvIf Cookie "frontdoor=([^;]+)" MyFrontDoor=%1
Header always set X-REWRGO "%{REWRGO}e"
Header always set X-RSLT "%{RSLT}e"
Header always set X-FRONTDOOR "%{MyFrontDoor}e"
Header always set X-Request_URI1 "%{REQUEST_URI}e"
Header always set X-Request_URI2 "expr=%{REQUEST_URI}"
</Location>
When I have this in my httpd.conf, start Apache (httpd.exe), and issue a request, this is what I get now:
$ curl -IkL http://127.0.0.1/subfold/dl/my/test.html
HTTP/1.1 404 Not Found
Date: Mon, 18 Oct 2021 12:35:36 GMT
Server: Apache/2.4.46 (Win32) OpenSSL/1.1.1j
X-REWRGO: Yes
X-RSLT: C:/bin/Apache24/htdocs/subfold/dl/my/test.html
X-FRONTDOOR: (null)
X-Request_URI1: (null)
X-Request_URI2: /subfold/dl/my/test.html
Set-Cookie: frontdoor=yes_"C:/bin/Apache24/htdocs/subfold/dl/my/test.html"; path=/; domain=127.0.0.1; expires=Tue, 19-Oct-2021 12:35:36 GMT
Content-Type: text/html; charset=iso-8859-1
So, something started working ... and what I gather from this:
First, the AllowAnyURI option is not really needed here. I thought it was, because https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html says:
When RewriteRule is used in VirtualHost or server context with version 2.2.22 or later of httpd, mod_rewrite will only process the rewrite rules if the request URI is a URL-path.
I was afraid that my request URI might not be recognized, so I enabled AllowAnyURI - but that does not seem to be the issue here.
So here is what this tests: with the rule "RewriteRule (.*) -" we basically have a "passthrough", that is, the original request does not get rewritten; again, from https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html :
The Substitution may be a: ...
- (dash)
A dash indicates that no substitution should be performed (the existing path is passed through untouched). This is used when a flag (see below) needs to be applied without changing the path.
... and we also have the corresponding [PT] flag there. Then, since the entire matched request is in parentheses, (.*), and therefore captured, we can output it with $1 in the same line; and finally, we set both an environment variable RSLT and a cookie to this value.
Now here is where I got surprised: what ends up in the "Pattern" part of the rewrite rule is not the original URL - it is what Apache would see as the translation of this URL to the local filesystem!!!
I first noticed this when I got split values for the cookie: the CO flag always splits on colons (:), even when the colon comes from string data in a variable (which here contained C:/.., and that is what ended up being split)! Hence I tried to use double quotes, but that did not work - what eventually worked was this, from https://httpd.apache.org/docs/2.4/rewrite/flags.html#flag_co :
If a literal ':' character is needed in any of the cookie fields, an alternate syntax is available. To opt-in to the alternate syntax, the cookie "Name" should be preceded with a ';' character, and field separators should be specified as ';'.
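For reference, the two CO syntaxes side by side. This is only a sketch of the flag syntax described in the quote above: the patterns and the cookie values are made up for illustration, and the rules are written as they would appear in server or virtual-host context.

```apache
# Default syntax: fields are separated by ':', so a ':' inside the value
# would be read as a field separator.
RewriteRule "^/index\.html$" "-" [CO=frontdoor:yes:127.0.0.1:1440:/]

# Alternate syntax: a ';' before the cookie name opts in, and ';' then
# separates the fields, so the value may contain a literal ':'.
RewriteRule "^/other\.html$" "-" [CO=;frontdoor;C:/some/path;127.0.0.1;1440;/]
```

In both forms the fields are name, value, domain, lifetime in minutes, and path, in that order.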
In any case, here we've had:
- http://127.0.0.1/subfold/dl/my/test.html - the original request
- C:/bin/Apache24/htdocs/subfold/dl/my/test.html - what Apache mapped that request to, in the local filesystem
And essentially, this was the problem with the code in the OP: RewriteRule ^subfold/dl/(.*)/(.*)$ ... would mean: match (and replace) a string that starts with subfold... - however, the string C:/bin/... definitely does not start like that, so the RewriteRule never matches, and never executes!
I guess another source of confusion for me was that (.*) as the Pattern (first argument) of a RewriteRule typically sees something that looks like the REQUEST_URI - but that is within .htaccess in a directory; from https://httpd.apache.org/docs/2.4/mod/mod_rewrite.html again:
In VirtualHost context, The Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string (e.g. "/app1/index.html"). This is the (%-decoded) URL-path.
In per-directory context (Directory and .htaccess), the Pattern is matched against only a partial path, for example a request of "/app1/index.html" may result in comparison against "app1/index.html" or "index.html" depending on where the RewriteRule is defined.
... where REQUEST_URI is described as:
The path component of the requested URI, such as "/index.html". This notably excludes the query string which is available as its own variable named QUERY_STRING.
So, my mistake was assuming that (.*) as the Pattern in the RewriteRule would be matched against the path component (REQUEST_URI-like), and I attempted matching against that; whereas, in this case, it was actually the full filesystem path, starting with C:/...!
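One way to sidestep the question of which string the Pattern receives is to do the matching in a RewriteCond on %{REQUEST_URI} and refer to its captured groups with %1 and %2. The following is a minimal sketch and has not been tested against the setup above; the names RSLT2 and X-RSLT2 are made up for illustration. It is also worth knowing that the mod_rewrite documentation describes rewrite rules inside Location sections as syntactically permitted but unsupported, which may be part of why the Pattern input is surprising here.

```apache
<Location /subfold/dl>
RewriteEngine On
# Match the URL-path itself, independent of what the RewriteRule Pattern sees
RewriteCond %{REQUEST_URI} ^/subfold/dl/([^/]+)/([^/]+)$
# %1 and %2 refer to the groups captured by the last matched RewriteCond;
# "^" matches any input and "-" leaves the request untouched
RewriteRule ^ - [E=RSLT2:httpdocs__%1__%2]
Header always set X-RSLT2 "%{RSLT2}e"
</Location>
```

With a request for /subfold/dl/my/test.html, the intent is that RSLT2 holds httpdocs__my__test.html without any filesystem prefix.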
That being said, we can now uncomment the line RewriteRule subfold/dl/(.*)/(.*)$ httpdocs__$1__$2 in the above snippet - noting that this time, the pattern does not have the starting caret (^), which would mean "match the start of string":
<Location /subfold/dl>
RewriteEngine On
RewriteCond %{REQUEST_URI} ^/subfold/dl/
RewriteRule subfold/dl/(.*)/(.*)$ httpdocs__$1__$2
RewriteRule (.*) - [E=RSLT:$1,CO=;frontdoor;yes_"$1";127.0.0.1;1440;/,NE,PT,L]
SetEnvIf Request_URI "^/subfold/dl/" REWRGO=Yes
# "without 'always' your additions will only go out on successful responses"
SetEnvIf Cookie "frontdoor=([^;]+)" MyFrontDoor=%1
Header always set X-REWRGO "%{REWRGO}e"
Header always set X-RSLT "%{RSLT}e"
Header always set X-FRONTDOOR "%{MyFrontDoor}e"
Header always set X-Request_URI1 "%{REQUEST_URI}e"
Header always set X-Request_URI2 "expr=%{REQUEST_URI}"
</Location>
... and the response to this is:
$ curl -IkL http://127.0.0.1/subfold/dl/my/test.html
HTTP/1.1 404 Not Found
Date: Mon, 18 Oct 2021 13:18:33 GMT
Server: Apache/2.4.46 (Win32) OpenSSL/1.1.1j
X-REWRGO: Yes
X-RSLT: C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html
X-FRONTDOOR: (null)
X-Request_URI1: (null)
X-Request_URI2: /subfold/dl/httpdocs__my__test.html
Set-Cookie: frontdoor=yes_"httpdocs__my__test.html/dl/my/test.html"; path=/; domain=127.0.0.1; expires=Tue, 19-Oct-2021 13:18:33 GMT
Content-Type: text/html; charset=iso-8859-1
So, yeah - finally I can see the expected mod_rewrite string substitution/replacement in headers, even if the rewritten URI is incorrect (i.e. does not correctly map to existing local filesystem data) -- which is what I wanted to achieve.
For fun, this is what comes out of the Apache error (that is, mod_rewrite) log:
[Mon Oct 18 15:21:55.007369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] init rewrite engine with requested uri /subfold/dl/my/test.html
[Mon Oct 18 15:21:55.007369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] pass through /subfold/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [setenvif:trace2] [pid 15508:tid 1936] mod_setenvif.c(630): [client 127.0.0.1:12206] Setting REWRGO
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] add path info postfix: C:/bin/Apache24/htdocs/subfold -> C:/bin/Apache24/htdocs/subfold/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] applying pattern 'subfold/dl/(.*)/(.*)$' to uri 'C:/bin/Apache24/htdocs/subfold/dl/my/test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace4] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] RewriteCond: input='/subfold/dl/my/test.html' pattern='^/subfold/dl/' => matched
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] rewrite 'C:/bin/Apache24/htdocs/subfold/dl/my/test.html' -> 'httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] add per-dir prefix: httpdocs__my__test.html -> /subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] add path info postfix: /subfold/dl/httpdocs__my__test.html -> /subfold/dl/httpdocs__my__test.html/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] strip per-dir prefix: /subfold/dl/httpdocs__my__test.html/dl/my/test.html -> httpdocs__my__test.html/dl/my/test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] applying pattern '(.*)' to uri 'httpdocs__my__test.html/dl/my/test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] setting env variable 'RSLT' to 'httpdocs__my__test.html/dl/my/test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] setting cookie 'frontdoor=yes_"httpdocs__my__test.html/dl/my/test.html"; path=/; domain=127.0.0.1; expires=Tue, 19-Oct-2021 13:21:55 GMT'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] forcing '/subfold/dl/httpdocs__my__test.html' to get passed through to next API URI-to-filename handler
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] trying to replace context docroot C:/bin/Apache24/htdocs with context prefix
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b22d90/initial] [perdir /subfold/dl/] internal redirect with /subfold/dl/httpdocs__my__test.html [INTERNAL REDIRECT]
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] init rewrite engine with requested uri /subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] pass through /subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [setenvif:trace2] [pid 15508:tid 1936] mod_setenvif.c(630): [client 127.0.0.1:12206] Setting REWRGO
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] add path info postfix: C:/bin/Apache24/htdocs/subfold -> C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] applying pattern 'subfold/dl/(.*)/(.*)$' to uri 'C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] add path info postfix: C:/bin/Apache24/htdocs/subfold -> C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace3] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] applying pattern '(.*)' to uri 'C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] setting env variable 'RSLT' to 'C:/bin/Apache24/htdocs/subfold/dl/httpdocs__my__test.html'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace5] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] skipping already set cookie 'frontdoor'
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace2] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] forcing 'C:/bin/Apache24/htdocs/subfold' to get passed through to next API URI-to-filename handler
[Mon Oct 18 15:21:55.008369 2021] [rewrite:trace1] [pid 15508:tid 1936] mod_rewrite.c(483): [client 127.0.0.1:12206] 127.0.0.1 - - [127.0.0.1/sid#f88ee0][rid#6b1a020/initial/redir#1] [perdir /subfold/dl/] initial URL equal rewritten URL: C:/bin/Apache24/htdocs/subfold [IGNORING REWRITE]
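Trace output like the above comes from raising the log level for the modules in question. The exact levels used for this log are not stated here; the line below is an assumption that is consistent with the trace1 to trace5 rewrite entries and the trace2 setenvif entries shown.

```apache
# Per-module log levels; the messages end up in the regular ErrorLog.
# Assumed levels, chosen only to match the entries visible in the log above.
LogLevel warn rewrite:trace5 setenvif:trace2
```

Levels this verbose slow the server down noticeably, so they are best kept to debugging sessions.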
Finally, note:
- Now that our RewriteRule does not insist on ^ (start of string), the cookie value does not contain the starting part of the absolute path as before
- However, we still have the absolute local (C:/...) path for RSLT - and that persists even if we add the [P] (proxy) or [R=302] (redirect) flags to the RewriteRule (there is no change in behavior, since the page returns 404 regardless)
- For some reason, SetEnvIf Cookie doesn't match the cookie, even though the cookie gets written out
- Even though %{REQUEST_URI} definitely exists, it does not get output in a header via the "%{REQUEST_URI}e" syntax - it only gets output via the "expr=%{REQUEST_URI}" syntax, which I guess is an ap_expr
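A possible explanation for the last two points, offered as an assumption rather than something verified on this setup: SetEnvIf Cookie inspects the Cookie header of the incoming request, whereas the CO flag adds a Set-Cookie header to the response, and curl does not send a cookie back unless told to (for example with -b). In addition, SetEnvIf substitutes captured groups as $1 to $9, not %1. As for REQUEST_URI, the "%{NAME}e" form in mod_headers reads an environment variable, and REQUEST_URI appears to be a server variable known to mod_rewrite and ap_expr rather than an environment variable. The sketch below copies it into one first; MyRequestURI and X-Request_URI3 are made-up names.

```apache
# Matches only when the client actually sends "Cookie: frontdoor=..."
SetEnvIf Cookie "frontdoor=([^;]+)" MyFrontDoor=$1
# Copy the URL-path into an environment variable that %{...}e can read
SetEnvIf Request_URI "^(.*)$" MyRequestURI=$1
Header always set X-FRONTDOOR "%{MyFrontDoor}e"
Header always set X-Request_URI3 "%{MyRequestURI}e"
```

To exercise the cookie match, the request would need to carry the cookie, for example by adding -b "frontdoor=yes" to the curl command.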