
My situation is very similar to the one in this question (in fact, the code is very similar). I've been trying to create a .htaccess file that serves URLs without file extensions, so that e.g. https://example.com/file finds file.html in the appropriate directory, and that also redirects https://example.com/file.html (using an HTTP redirect) to https://example.com/file so there is only one canonical URL. With the following .htaccess:

Options +MultiViews
RewriteEngine On

# Redirect <...>.php, <...>.html to <...> (without file extension)
RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

I've been running into a redirect loop just as in the question mentioned above. (In my case, finding the corresponding file is achieved by MultiViews instead of a separate RewriteRule.)

However, with a solution adopted from this answer:

Options +MultiViews
RewriteEngine On

# Redirect <...>.php, <...>.html to <...> (without file extension)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s(.+)\.(php|html)
RewriteRule ^ %1 [L,R]

there is no redirect loop. I’d be interested to find out where the difference comes from. Aren’t both files functionally equivalent? Why does a “normal” RewriteRule create a loop, while a condition on %{THE_REQUEST} doesn’t?

Note that I’m not looking for a way to get clean URLs (I could just use the second version of my file or the answer to the question linked above, which looks at %{ENV:REDIRECT_STATUS}), but for the reason why these two approaches work/don’t work, so this is not the same question as the one linked above.

Note: I'm seeing the same problem using only mod_rewrite (without MultiViews), so it doesn't seem to be due to the order of execution of MultiViews and mod_rewrite:

Options -MultiViews
RewriteEngine On

## Redirect <...>.php, <...>.html to <...> (without file extension)
# This works...
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s(.+)\.(php|html)
RewriteRule ^ %1 [L,R]
# But this doesn’t!
#RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

# Find file with file extension .php or .html on the filesystem for a URL
# without file extension
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^ %{REQUEST_FILENAME}.php [L]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^ %{REQUEST_FILENAME}.html [L]

Where’s the difference? I would expect both approaches to work because the internal rewrite to a file is at the very end of the .htaccess with an [L] flag, so there shouldn't be any processing or redirecting happening afterwards, right?

hjpotter92
Socob

2 Answers

# But this doesn’t!
#RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

The commented-out rule causes a redirect loop because your other rule appends the .html extension internally, which changes the URL that mod_rewrite sees to /file.html. On the next pass the redirect rule matches that rewritten URL and sends the client back to /file, which the other rule then rewrites to /file.html again. Because the redirect is external, this repeats until the client gives up on the redirect chain.

You also need to understand that, in .htaccess context, mod_rewrite re-runs the whole ruleset after a rewrite until a pass leaves the URL unchanged. Because each rule keeps undoing the other, the loop never settles.

The rule based on THE_REQUEST works because THE_REQUEST holds the original request line sent by the client and is not overwritten when other rewrite rules execute.
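To make the sequence concrete, here is a hedged trace of both variants, written as comments around the rules from the question. It assumes a hypothetical file.html in the document root, no file.php, and MultiViews disabled. The pass numbering only illustrates the per-directory behaviour described above; it is not actual log output.

```apache
Options -MultiViews
RewriteEngine On

# Assumption: /file.html exists on disk, /file.php does not.

# Variant A: pattern-based redirect (loops, therefore commented out)
#   Client sends:  GET /file HTTP/1.1
#   Pass 1: "file" has no .php/.html suffix, so this rule is skipped and
#           the last rule below rewrites the request internally to file.html.
#   Pass 2: the ruleset runs again, now on "file.html"; this rule matches
#           and answers with an external redirect to /file.
#   The client requests /file again and the cycle repeats.
#RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

# Variant B: THE_REQUEST-based redirect (no loop)
#   Client sends:  GET /file HTTP/1.1
#   Pass 1: THE_REQUEST is "GET /file HTTP/1.1", the condition fails, and
#           the last rule below rewrites the request internally to file.html.
#   Pass 2: THE_REQUEST is still "GET /file HTTP/1.1", so the condition
#           fails again; file.html now exists as a file, so the !-f
#           conditions fail too and the URL is left unchanged.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,}\s(.+)\.(php|html)
RewriteRule ^ %1 [L,R]

# Internal rewrite from the extensionless URL to the file on disk
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_FILENAME}.html [L]
```

A direct request for /file.html still redirects under variant B, because in that case THE_REQUEST itself contains the extension.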

anubhava
  • But shouldn't the `[L]` flag stop the mod_rewrite loop? Why does it still keep going after e.g. `RewriteRule ^ %{REQUEST_FILENAME}.html [L]`? – Socob Nov 23 '15 at 09:03
  • No, the `L` flag doesn't stop the loop. `L` merely ends the current rule and forces `mod_rewrite` to run the loop again. – anubhava Nov 23 '15 at 11:11
  • By “current rule”, do you mean “current pass of the `.htaccess` file”? Otherwise the flag would be kind of useless... – Socob Nov 23 '15 at 12:55
  • Yes, that is what I mean. `L` stops processing at the `RewriteRule` where it is used and then acts like `continue` for the mod_rewrite loop. Also note that if you use `END` instead (available with Apache 2.4+), that will definitely terminate the `mod_rewrite` loop. – anubhava Nov 23 '15 at 13:37
  • [Also see the flow chart in this document](http://httpd.apache.org/docs/current/rewrite/tech.html) – anubhava Nov 23 '15 at 14:55
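As a hedged illustration of the `END` suggestion in the comments, the pattern-based redirect from the question could be kept if the internal rewrites use `END` instead of `L`. This is an untested sketch that assumes Apache 2.4+ and MultiViews disabled; the rules themselves are taken from the question.

```apache
Options -MultiViews
RewriteEngine On

# External redirect from /file.php or /file.html to /file
RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

# Internal rewrite to the file on disk. Unlike L, END also prevents any
# later rewrite pass in .htaccess context for this request, so the
# redirect rule above is not re-evaluated against the rewritten file name.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.php -f
RewriteRule ^ %{REQUEST_FILENAME}.php [END]
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_FILENAME}.html [END]
```

The redirect rule keeps `[L,R]` because it ends the request with an external redirect anyway; only the internal rewrites need to suppress the next pass.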

If you look at the documentation for the RewriteRule directive, you'll notice the following:

On the first RewriteRule, it is matched against the (%-decoded) URL-path of the request, or, in per-directory context (see below), the URL path relative to that per-directory context. Subsequent patterns are matched against the output of the last matching RewriteRule.

Since the pattern is matched in per-directory context, once you add the following rule:

RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

the REQUEST_URI variable changes after each internal rewrite, and mod_rewrite processes the ruleset again. MultiViews maps the redirected, extensionless URL to the proper file, the redirect rule then matches that file name, and the result is a loop, because the URI changes on every pass.

When you match against the THE_REQUEST variable instead, the URI may still change on internal rewrites, but the request line as received by the server never changes unless a redirect is performed and the client sends a new request.
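The same idea, redirecting only on the client's original request, can also be expressed with the `%{ENV:REDIRECT_STATUS}` check that the question mentions. The following is a minimal, untested sketch; it assumes MultiViews is disabled and relies on REDIRECT_STATUS being empty on the initial pass and set once an internal rewrite has taken place.

```apache
Options -MultiViews
RewriteEngine On

# Redirect only if no internal rewrite has happened yet for this request.
# After the internal rewrite below, REDIRECT_STATUS is no longer empty,
# so the condition fails on the next pass and the rule is skipped.
RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.+)\.(php|html)$ /$1 [L,R]

# Internal rewrite from the extensionless URL to the .html file on disk
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^ %{REQUEST_FILENAME}.html [L]
```

Like the THE_REQUEST condition, this guard distinguishes the original request from a rewritten one instead of relying on `[L]` to end processing.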

hjpotter92
  • Same question here as for the other answer: Shouldn't the `[L]` flag stop the mod_rewrite loop (in my last example)? Why does it still keep going after e.g. `RewriteRule ^ %{REQUEST_FILENAME}.html [L]`? Even if other rules would still match, shouldn't it stop at `[L]`? – Socob Nov 23 '15 at 09:09
  • @Socob The [`L` flag documentation](https://devdocs.io/apache_http_server/rewrite/flags#flag_l) has: "_It is therefore important, if you are using `RewriteRule` directives in one of these contexts, that you take explicit steps to avoid rules looping, and **not count solely on the `[L]` flag to terminate execution of a series of rules**, as shown below._". – hjpotter92 Nov 23 '15 at 09:23
  • Yes, I know that part of the documentation. What I’m trying to understand is _why_ `[L]` doesn’t help me in this case. Is it because of the following part? “It is possible that as the rewritten request is handled, the `.htaccess` file or `<Directory>` section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over.” – Socob Nov 23 '15 at 12:51
  • @Socob ah, yes. I removed the wrong paragraph when I pasted text in previous comment. – hjpotter92 Nov 23 '15 at 12:56
  • @Socob Once the `L` flag is encountered, mod_rewrite does whatever is asked of it. Now, if an internal redirect was applied, the `REQUEST_URI` gets updated, and the .htaccess is processed from the start again. If you do not have the `L` flag, mod_rewrite will continue to match rules until the end of the file and then start from the beginning once more, if there were successful `REQUEST_URI` changes. – hjpotter92 Nov 23 '15 at 12:58