3

To remove index.html or index.htm from URLs, I use the following in my .htaccess:

RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
RewriteRule ^(.*)index\.html?$ "/$1" [NC,R=301,NE,L]

This works! (More info about flags at the end of this question *)

Then, to add www to URLs, I use the following in my .htaccess:

RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
RewriteRule ^(.*)$ "http://www.mydomain.com/$1" [R=301,NE,L]

This works too!

The question here is how to avoid the double redirection created by the rules above in cases like the one below:

  1. the browser asks for http://mydomain.com/path/index.html
  2. the server sends a 301 header to redirect the browser to http://mydomain.com/path/
  3. the browser then requests http://mydomain.com/path/
  4. now the server sends a 301 header to redirect the browser to http://www.mydomain.com/path/

This is obviously not very smart, because a user who asks for http://mydomain.com/path/index.html is redirected twice and the page feels slow to load. Moreover, Googlebot might stop following the link because of the double redirection (I'm not sure about this last point and I don't want to get into a discussion about it; it's just another possible issue).

Thanks!


*For anyone interested:

  • NC makes the match case-insensitive, so uppercase file names such as INDEX.HTML / InDeX.HtM are redirected too.
  • NE avoids double URL encoding: without it, http://.../index.html?hello=ba%20be would be redirected to http://.../index.html?hello=ba%2520be.
  • QSA is used to carry the query string over, i.e. http://.../index.html?hello=babe to http://.../?hello=babe (not needed, thanks to anubhava's answer).
anubhava
Marco Demaio
  • [Answer in near-duplicate](http://stackoverflow.com/questions/5607001/using-htaccess-to-redirect-domain-co-uk-index-html-to-www-domain-co-uk). To be fair the other question doesn't mandate the use of one redirect per many rules, but the answer is correct anyway. – Core Xii May 19 '11 at 15:14
  • @Core Xii: I read that question and its answer before asking mine. The rules there work the same as mine, but as you said, this question is about how to avoid the double redirect, while the other question does not mind doing a double redirect. – Marco Demaio May 19 '11 at 15:24
  • 1
    The other _question_ doesn't mind, but the accepted _answer_ does what you're asking regardless, does it not? – Core Xii May 19 '11 at 15:31
  • @Core Xii: I tested the answer you suggested http://stackoverflow.com/questions/5607001/using-htaccess-to-redirect-domain-co-uk-index-html-to-www-domain-co-uk/5610979#5610979, it works for that question, but it still performs a double 301 redirect, so it does not solve my question. – Marco Demaio May 19 '11 at 15:50

3 Answers

6

To avoid the double redirection, add another rule to the .htaccess file that handles both conditions at once, like this:

Options +FollowSymlinks -MultiViews
RewriteEngine on

RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]
RewriteRule . http://www.%{HTTP_HOST}%1 [R=301,NE,L]

RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule . http://www.%{HTTP_HOST}%{REQUEST_URI} [NE,R=301,L]

RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]
RewriteRule . %1 [R=301,NE,L]

So if the input URL is http://mydomain.com/path/index.html, both conditions of the first rule are satisfied and there is one single redirect (301) to http://www.mydomain.com/path/.

Also, I believe the QSA flag is not really needed above since you are NOT manipulating the query string.
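
As a rough illustration of the query-string point (a sketch based on mod_rewrite's documented behaviour, not something tested on the asker's server): when the substitution contains no `?`, the original query string is passed through unchanged, so `QSA` only matters once the substitution adds a query string of its own. The `src=index` parameter below is hypothetical and appears only for contrast.

```apache
# Default: no "?" in the substitution, so the query string is kept as-is.
# /path/index.html?hello=babe  ->  /path/?hello=babe
RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]
RewriteRule . %1 [R=301,NE,L]

# Contrast only (hypothetical parameter; left commented out so it does not
# compete with the rule above). The substitution sets its own query string,
# so QSA is needed to append the original one.
# /path/index.html?hello=babe  ->  /path/?src=index&hello=babe
#RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]
#RewriteRule . %1?src=index [R=301,NE,QSA,L]
```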

anubhava
  • So basically I need to merge the two RewriteRule/RewriteCond pairs into one. Interesting, I thought there would be an easier way. BTW I think the `/?` is pointless in `(.*/?)` because the preceding `.*` already matches any character. About `QSA`, I think you are right, it's useless; I updated the question. – Marco Demaio May 19 '11 at 19:10
  • I also added `RewriteCond %{REQUEST_URI} /index\.html?$ [NC]` to my question before the index.html rewrite rule, otherwise `http://.../pathindex.html` (no slash between `path` and `index.html`) would also be redirected to `http://.../path`. – Marco Demaio May 19 '11 at 19:15
  • @Marco Demaio: I made some minor edits to my answer above to catch one issue with `(.*/?)`. Earlier, a URI of `/myindex.html` was also getting redirected to `/my`, and we certainly don't want that to happen, so now I am capturing the URI with its leading slash from the %{REQUEST_URI} variable and using that on the RHS. Please try it one more time. – anubhava May 19 '11 at 19:38
  • I think you forgot a backslash in `RewriteCond %{REQUEST_URI} ^(.*/)index.html$ [NC]`: this condition is also satisfied by `http://.../path/index0html` because the `.` in the regexp matches any character. It should be `RewriteCond %{REQUEST_URI} ^(.*/)index\.html$ [NC]`. Anyway, I got the idea of your solution from your first paragraph: `...have another rule in .htaccess file that meets both conditions` – Marco Demaio May 20 '11 at 15:24
  • The problem for me with this example is `http://example.com/index.html` redirects to `http://example.com/` instead of to `http://www.example.com/` which would fix both problems at once. – Mattypants Jul 30 '15 at 18:35
2

A better solution would be to place the index.html rule ahead of the www rule and inside the index.html rule ADD the www prefix to the destination url. This way someone looking for http://domain.com/index.html would get sent to http://www.domain.com/ by the FIRST rule. The second (www) rule would then only apply if index AND www are missing, which is again only one redirect.
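
A minimal sketch of that ordering, assuming the site's canonical host is www.mydomain.com as in the question (the hard-coded hostname is an assumption, and the rules have not been tested on a live server):

```apache
RewriteEngine On

# Rule 1: strip index.html / index.htm and always point at the www host,
# so a request on either host needs only this one redirect.
RewriteCond %{REQUEST_URI} ^(.*/)index\.html?$ [NC]
RewriteRule ^ http://www.mydomain.com%1 [R=301,NE,L]

# Rule 2: reached only when the URL has no index.html; add the missing www prefix.
RewriteCond %{HTTP_HOST} !^www\.mydomain\.com$ [NC]
RewriteRule ^ http://www.mydomain.com%{REQUEST_URI} [R=301,NE,L]
```

With these two rules, a request for http://mydomain.com/path/index.html should be answered with a single 301 to http://www.mydomain.com/path/, while http://mydomain.com/path/ falls through to the second rule and is also redirected once.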

stuff
  • Sorry, but I'm missing your point: anubhava's answer (http://stackoverflow.com/a/6062534/260080) already does ONE AND ONLY ONE redirect. – Marco Demaio Dec 09 '12 at 14:44
-1

Remove the L flag from the prior rule? L forces the rule parsing to stop (when the rule is matched) and thus send the first rewritten URL without applying the second rule.

The rules are applied sequentially from top to bottom, each rewriting the URL again if it matches the rule's conditions and pattern.

RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [R=301]

RewriteRule ^(.*/)index\.html?$ $1 [NC,QSA,R=301,NE,L]

Hence the above will first add the www and then remove index.html (or index.htm) before sending the new URL: a single redirect for all the rules.

Core Xii
  • Sorry, but it does NOT work! I tried removing `L` before, and I tried again now just in case. If I remove the `L`, a user who goes to `http://domain.com/index.html` gets redirected to `http://domain.com/http://www.domain.com/` (and I did not write the URL twice by mistake, it's exactly the 301 header that is being sent out by the server) – Marco Demaio May 19 '11 at 14:37
  • I can't get your rules to work either, and I'm tired of wrestling with this thing. Maybe it doesn't like being placed in a subdirectory, I don't know. – Core Xii May 19 '11 at 15:10
  • Since you said my rules do not work, you can try with the rules from the question http://stackoverflow.com/questions/5607001/using-htaccess-to-redirect-domain-co-uk-index-html-to-www-domain-co-uk; still, I would like to avoid the double redirection. – Marco Demaio May 19 '11 at 15:25
  • Ok, I got it to work again; it was probably some browser cache issue. Updated my answer. The rules now _almost_ work, except for one case: `http://www.domain.com/index.html` doesn't remove the `index.html` for some reason. – Core Xii May 19 '11 at 15:34