0

Just want to confirm something. From what I gather of how mod_rewrite works, Apache receives an URL and immediately mod_rewrite applies (non-<directory>) rules in httpd.conf, then per-directory mod-rewriting goes to work, then restarts the process with a new URL if any changes are made. @JonLin's great answer to this question first says that when your per-directory rule specs an absolute replacement (ie. starting with a slash), it's assumed to be relative to the DocumentRoot which I get. But of relative replacements (no slash) Jon then says:

it's based on the directory that the rule is in. So if

RewriteRule ^foo$ bar.php [L]

is in the "root" and you go to http://example.com/foo, you get served http://example.com/bar.php. But if that rule is in the "subdir1" directory, and you go to http://example.com/subdir1/foo, you get served http://example.com/subdir1/bar.php. etc. This sometimes works and sometimes doesn't, as the documentation says, it's supposed to be required for relative paths, but most of the time it seems to work. Except when you are redirecting (using the R flag, or implicitly because you have http://host in your rule's target). That means this rule:

RewriteRule ^foo$ bar.php [L,R]

if it's in the "subdir2" directory, and you go to http://example.com/subdir2/foo, mod_rewrite will mistake the relative path as a file-path instead of a URL-path and because of the R flag, you'll end up getting redirected to something like: http://example.com/var/www/localhost/htdocs/subdir1.

As Jon explains in the last bit, when a redirect will occur and when there's no rewriteBase, a string intended as filepath gets appended to the site's base address to create a phony URL. But just to confirm, even in the former case Jon mentions, ie. not an actual redirect, the substituted string does get sent back to Apache's URL-reception code, restarting the whole process, correct? The diagram on this page of the spec seems to imply that until no rules make a change, the process keeps restarting. These non-redirect cases would seem to be the time when it WOULD make sense to tack the filepath right from the file system root to the htaccess directory onto the beginning of the substitution. But how does that get turned into a proper URL as expected by the URL-reception code - does http://localhost get prepended? I think that would make everything relative to the documentroot, not the actual file system root.

Thanks!

Community
  • 1
  • 1

1 Answers1

0

Been doing some more reading and think I've got this explained, for anyone who's interested. Regarding my question about how a file system absolute path gets turned into a valid url for the internal redirect, I was thinking that the URI in an HTTP request contained "http://hostname", but this has been cut off ie. the URI is like /this/is/a/path. The host name is in a separate "Host" header field, and is no longer a vital piece of information by the time mod_rewrite is running, as Apache's initial Post Read Request phase has already noticed the GET request on the port and, if Name-Based Virtual Hosting is in use, interpreted things like the DocumentRoot from the Host header field, and finally called the URI Translation Phase where mod_rewrite executes. So any time mod_rewrite is running, there could be only one host name that got us here.

So to summarize, what I had called the "URL-reception" part of Apache always deals with /paths/like/this/without/hostname, not just after internal redirects. The spec does say that rewriteCond/rewriteRule match against such paths, but I figured the host name was there initially and got removed. So then all that's left is to ensure our rules are prepared for cases where they are running in an internal redirect spawned by an earlier runthrough of themselves, and not do something inadvertent when they see a file system absolute path caused by a replacement that didn't start with a slash. What a mouthful.