8

We have a website running on Magento, and unfortunately all of its URLs have the .html suffix. Magento lets you change this in the CMS, but these .html URLs rank well in Google, so we need to redirect them to their non-.html equivalents.

Consider the following scenario: we are rebuilding the site from scratch, and the new site has the same URLs but without the .html suffix.

  • Now is: www.example.de/cool-shoes.html
  • Will be: www.example.de/cool-shoes

So www.example.de/cool-shoes.html will no longer exist. I've been trying to set up a redirect in .htaccess, with no luck.

I've tried so far:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*\.html\ HTTP/
RewriteRule (.*)index\.html$ /$1 [R=301,L] 

and:

RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html

but neither seems to work. Any ideas?

Stephen Ostermiller
Kaßta

8 Answers

12

OK, so after some research, and after failing to achieve this with a rewrite rule, the following line worked:

redirectMatch 301 ^(.*)\.html $1

This is quite useful for removing any URL extension while avoiding broken links. Hopefully it helps someone in the future.

cheers!

Kaßta
  • Hello, just to clear up my doubt: what is this rule doing? Isn't the first parameter supposed to be the expression you receive from the request, and the second the one you send to Apache? Shouldn't they be switched -> ^(.*) $1.html – alexserver Sep 26 '13 at 05:07
  • That rule is removing the `.html`. The first part is what it is matching: any URL that contains `.html`. The `$1` is what is in the parentheses of the match: everything before the `.html`. – Stephen Ostermiller Aug 18 '16 at 09:34
  • I would add a `$` to ensure the `.html` comes at the end of the URL: `redirectMatch 301 ^(.*)\.html$ $1` – Stephen Ostermiller Aug 18 '16 at 09:49
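
To make the difference between the two patterns concrete, here is a small Python sketch. It only mimics the regex matching with Python's `re` module as a stand-in for Apache's regex engine (the two should agree for patterns this simple); the helper `strip_html` and the sample paths are made up for illustration and are not part of the thread.

```python
import re

def strip_html(pattern, path):
    """Return the text captured as $1, or None if the pattern does not match.

    re.search is used because RedirectMatch matches the regex anywhere in
    the URL path unless the pattern itself is anchored.
    """
    m = re.search(pattern, path)
    return m.group(1) if m else None

UNANCHORED = r"^(.*)\.html"   # pattern from the answer
ANCHORED = r"^(.*)\.html$"    # pattern with the trailing $ from the comment

# Hypothetical paths: one ordinary page, one with ".html" in the middle.
for path in ["/cool-shoes.html", "/docs.html-archive/page"]:
    print(path, "->", strip_html(UNANCHORED, path), "|", strip_html(ANCHORED, path))
```

In this sketch both patterns capture `/cool-shoes` for the first path, but only the unanchored one matches the second path (capturing `/docs`), which is the kind of accidental redirect the trailing `$` is meant to prevent.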
8

This will rewrite the URL like so: http://example.com/page.html -> http://example.com/page

# Remove .html from url
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html
Joshua Pekera
  • This code appears to add a `.html` not remove it. It also appears to be a rewrite rule, not a redirect rule. It would need `[R=301,L]` at the end to do any redirecting. – Stephen Ostermiller Aug 18 '16 at 09:38
4

Try adding the following to the .htaccess file in the root directory of your site to redirect URLs with the .html extension to the same URL without it.

Options +FollowSymLinks -MultiViews
DirectorySlash Off

RewriteEngine On

RewriteCond %{SCRIPT_FILENAME}/ -d
RewriteCond %{SCRIPT_FILENAME}.html !-f
RewriteRule [^/]$ %{REQUEST_URI}/ [R=301,L]

RewriteCond %{ENV:REDIRECT_STATUS} ^$
RewriteRule ^(.+)\.html$ /$1 [R=301,L]

RewriteCond %{SCRIPT_FILENAME}.html -f
RewriteRule [^/]$ %{REQUEST_URI}.html [QSA,L]
Suhas
2

Here's the solution that worked for me.

    RewriteCond %{THE_REQUEST} \.html [NC]
    RewriteRule ^(.*)\.html$ /$1 [R=301,L]

    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME}\.html -f
    RewriteRule ^(.*)$ $1.html [L]
1

This should do the trick:

RewriteEngine On
RewriteRule ^(\w+)\.html$ /$1 [R=301,L]
ᴘᴀɴᴀʏɪᴏᴛɪs
  • Thanks, but nothing seems to change. I know I can just remove the suffix in the Magento CMS, but then goodbye Google ranking. What is the "w+" for? Should I use [R=301, NC] or something? – Kaßta Apr 20 '12 at 11:30
  • Well, what this does is take any request ending with .html (\w+ matches any non-empty string of alphanumeric characters and underscores) and redirect it to the same address without the .html. I assume you want something else, and Vikas's solution seems like a neat one – ᴘᴀɴᴀʏɪᴏᴛɪs Apr 20 '12 at 11:48
  • Thanks mate, I think you are going the right way. Vikas's solution is good but, as I explained above, only for cases where the URL exists. In this case the .html URL will be removed, and I want everyone to be redirected to the new URL without the .html. I'm editing the main .htaccess and added the line you suggested in the correct place, but no redirection is happening. Maybe I'm doing it wrong by adding this line to the main .htaccess file of the whole Magento installation. I'll try to make a separate .htaccess for this shop (I have a multishop installation). Thanks! – Kaßta Apr 23 '12 at 09:10
  • This rule is missing the `[R=301,L]` that tells mod_rewrite to redirect rather than rewrite. `RewriteRule ^(\w+)\.html$ /$1 [R=301,L]` should work. – Stephen Ostermiller Aug 18 '16 at 09:41
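
One possible reason the `\w+` rule appears to do nothing for the URL in the question: `\w` matches only letters, digits and underscores, so a hyphenated name such as `cool-shoes` never matches. The Python sketch below checks this with the `re` module as a stand-in for Apache's regex engine. The helper `target` and the sample paths are illustrative assumptions; the paths have no leading slash because, in a per-directory .htaccess context, mod_rewrite strips it before matching.

```python
import re

# re.ASCII keeps \w limited to [A-Za-z0-9_]
WORD_ONLY = re.compile(r"^(\w+)\.html$", re.ASCII)   # pattern from this answer
ANY_PATH = re.compile(r"^(.+)\.html$")               # broader pattern from other answers

def target(pattern, path):
    """Return the '/$1' substitution for a matching path, or None if no match."""
    m = pattern.match(path)
    return "/" + m.group(1) if m else None

# Hypothetical request paths as a RewriteRule in .htaccess would see them.
for path in ["shoes.html", "cool-shoes.html", "product/raspberrypi.html"]:
    print(path, "->", target(WORD_ONLY, path), "|", target(ANY_PATH, path))
```

In this sketch only `shoes.html` matches the `\w+` pattern; the hyphenated and nested paths match only the broader `(.+)` pattern.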
1

Follow the steps, and you'll be able to remove .html from the URL without modifying the .htaccess file.

Vikas
  • Thanks, that actually works, but only if your pages had the .html suffix before. In this case, the page with .html doesn't exist. It exists on the old site but not on the new one, so every time somebody visits the old URL (www.mysite.de/thing.html), they should be redirected to www.mysite.de/thing instead of getting a 404 Not Found. I think this can be achieved with .htaccess, right? – Kaßta Apr 23 '12 at 09:00
  • A link-only answer is not high quality. The linked content could change or become unavailable. It is also not user friendly to make people click through to the answer. It would be better to at least summarize the steps here. – Stephen Ostermiller Aug 18 '16 at 09:36
0

Try putting this in your .htaccess file: Redirect permanent /cool-shoes.html http://www.mysite.de/cool-shoes (the first argument must be a URL path starting with a slash, not a hostname). This may be helpful to you.

Mufaddal
0

This is for URLs ending with .html: /product/raspberrypi.html ---> /product/raspberrypi/ (which is really /product/raspberrypi/index.php; the index.php is hidden). It took me a while to figure this out.

RewriteEngine On
RewriteBase /

RewriteCond %{REQUEST_URI} \.html$
RewriteRule ^(.*)\.html$ $1 [R=301,L]

RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

You have to use 'REQUEST_URI' and place this rule before the index.php rewrite rules, since otherwise it could be overridden by the application. It's important to note that it is the URI we are redirecting, not a filename or directory, since the files are all named index.php in their root folders (WordPress).

rickb