OK, got somewhere (also see somewhat related post Debugging Apache2 RewriteRule (with headers)?)
So, the only thing I could find somewhat working is Using RewriteMap - Apache HTTP Server Version 2.4. Now, since I use a Windows build of Apache, it gets somewhat extra tricky - but is still doable.
There is very little information online on how to get this working; the crucial piece was found in RewriteMap prg: issue on Windows - Apache Web Server forum at WebmasterWorld - By Pubcon:
RewriteMap program is kicked off IFF the "RewriteEngine On" directive is OUTSIDE as below
In my case, also, the RewriteMap program starts if and only if the RewriteMap directive is OUTSIDE <Location>, AND the "RewriteEngine On" is OUTSIDE <Location> - in any other case, the program does not start.
The second thing to be careful about is this, from https://httpd.apache.org/docs/2.4/rewrite/rewritemap.html :
When a MapType of prg is used, the MapSource is a filesystem path to
an executable program which will providing the mapping behavior. This
can be a compiled binary file, or a program in an interpreted language
such as Perl or Python.
This program is started once, when the Apache HTTP Server is started,
and then communicates with the rewriting engine via STDIN and STDOUT.
That is, for each map function lookup, it expects one argument via
STDIN, and should return one new-line terminated response string on
STDOUT. If there is no corresponding lookup value, the map program
should return the four-character string "NULL" to indicate this.
External rewriting programs are not started if they're defined in a
context that does not have RewriteEngine set to on.
In other words - the program HAS to read from its STDIN and write to its STDOUT - AND it MUST keep running, blocking on STDIN between lookups. Even if all you wanted to do was perl -i -pe's/SEARCH/REPLACE/', that kind of program reads its input, processes it, writes its output, and exits - so in this case, it would not do us any good.
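The quoted documentation allows the map program to be written in any interpreted language, so the protocol itself can be sketched in Python as well. This is only an illustration of the "one line in, one line out, never exit" loop described above; the names lookup and serve are made up for this sketch, returning "NULL" for an empty result is my own choice here (the Perl script in this answer does not do that), and I have not run this variant under httpd on Windows.

```python
import sys


def lookup(key: str) -> str:
    """Map one lookup key to its replacement (here: / becomes %2F)."""
    return key.replace("/", "%2F")


def serve(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Answer exactly one line per lookup until the server closes our stdin."""
    for line in stdin:  # blocks here until httpd sends the next key
        key = line.rstrip("\r\n")
        result = lookup(key)
        # "NULL" is the documented reply for "no corresponding lookup value";
        # treating an empty result that way is an assumption of this sketch.
        stdout.write((result if result else "NULL") + "\n")
        stdout.flush()  # reply immediately instead of waiting for a full buffer


# As a real map program, the file would end with a call to serve().
```

The important parts are the loop that never returns while stdin stays open, and the flush after every reply; without the flush, the rewrite engine would sit waiting for an answer that is still stuck in the program's output buffer.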
So, based on the example given in rewritemap.html - here is a Perl script, called convslash.pl and saved in C:\bin\Apache24\bin\, that replaces each forward slash (/) with %2F while blocking continuously:
#!C:/msys64/usr/bin/perl.exe
$| = 1; # Turn off I/O buffering
while (<STDIN>) {
    s|/|%2F|g; # Replace / with %2F
    print $_;
}
Then, I add this to my httpd.conf:
# the below starts and runs ONLY if RewriteEngine On is outside of <Location>; also a cmd.exe window is started (plus another for perl!)
#RewriteMap doprg "prg:c:/msys64/usr/bin/perl.exe c:/bin/Apache24/bin/convslash.pl"
# the below is slightly better - only one cmd.exe window is started:
RewriteMap doprg "prg:c:/Windows/System32/cmd.exe /c start /b c:/msys64/usr/bin/perl.exe c:/bin/Apache24/bin/convslash.pl"
# we MUST have RewriteEngine On here, outside of location - otherwise the RewriteMap program will never start:
RewriteEngine On
<Location /subfold/dl>
Options -Multiviews
RewriteEngine On
RewriteOptions Inherit
# first RewriteCond - this is just so we can capture the relevant parts into environment variables:
RewriteCond %{REQUEST_URI} ^/subfold/dl/(.*)/(.*)$
RewriteRule ^ - [E=ONE:%1,E=TWO:%2,NE]
# the above RewriteRule does not rewrite - but passes the input string further;
# so here, let's have another such RewriteRule - just so we can set our processed/desired output to a variable, which we can "print" via headers:
RewriteRule ^ - [E=MODDED:subfold/dl/${doprg:%{ENV:ONE}}/%{ENV:TWO},NE]
# the original URL will finally pass through unmodified to the "file handler" which will attempt to map it to the filesystem, it will fail, and return 404.
# the below headers should be returned along with that 404:
Header always set X-ONE "%{ONE}e"
Header always set X-TWO "%{TWO}e"
# note: INPUT is never set by the rules above, so this header comes back as (null):
Header always set X-INPUT "%{INPUT}e"
Header always set X-MODDED "%{MODDED}e"
Header always set X-REQ "expr=%{REQUEST_URI}"
</Location>
So, now I start the server locally (./bin/httpd.exe), and to test this, I issue a request with curl:
$ curl -IkL http://127.0.0.1/subfold/dl/my/spec/test.html
HTTP/1.1 404 Not Found
Date: Mon, 18 Oct 2021 17:08:11 GMT
Server: Apache/2.4.46 (Win32) OpenSSL/1.1.1j
X-ONE: my/spec
X-TWO: test.html
X-INPUT: (null)
X-MODDED: subfold/dl/my%2Fspec/test.html
X-REQ: /subfold/dl/my/spec/test.html
Content-Type: text/html; charset=iso-8859-1
... and finally, we can see in the X-MODDED header that we did indeed manage to replace only a substring in (what would be) the rewritten URL.
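To double-check why X-ONE comes out as my/spec rather than just my, the capture and the substitution can be replayed offline. The following is a rough Python sketch, not anything Apache runs: modded and convslash are names invented for this illustration, convslash merely stands in for the ${doprg:...} lookup, and returning None on a non-match is a simplification (in the real config the second RewriteRule has no condition and would still run).

```python
import re

# Same regex as in the RewriteCond; the first (.*) is greedy, so it takes
# everything up to the LAST slash.
PATTERN = re.compile(r"^/subfold/dl/(.*)/(.*)$")


def convslash(value: str) -> str:
    """Stand-in for the ${doprg:...} lookup performed by convslash.pl."""
    return value.replace("/", "%2F")


def modded(request_uri: str):
    """Return (ONE, TWO, MODDED) for a request URI, or None if it does not match."""
    m = PATTERN.match(request_uri)
    if m is None:
        return None
    one, two = m.group(1), m.group(2)
    return one, two, f"subfold/dl/{convslash(one)}/{two}"


print(modded("/subfold/dl/my/spec/test.html"))
# prints ('my/spec', 'test.html', 'subfold/dl/my%2Fspec/test.html')
```

The three values line up with the X-ONE, X-TWO and X-MODDED headers in the curl output: only the part captured into ONE goes through the map, so only its slash is converted.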
Well, I wish that this was documented somehow, and I didn't have to waste like 8 hours of my life to figure this out - but who cares, in a couple of years there will be new servers du jour, where all of this will be irrelevant, so more time will have to be wasted - all of it to serve more crap, ads and espionage.