My goal was to spare users from typing .html to reach a page on our site. On other sites I have left the file name as /pagename.html, and a user could type only /pagename and the page would load. For some reason that was not possible with our server settings (GoDaddy Plesk Parallel server), so my workaround was to create a folder for every page I wanted, with the actual file being index.html inside it. That worked: users no longer have to include .html to load a page. The problem now is that the Google and SEOmoz reports are flagging a lot of duplicate content, because a user can type three different URLs to reach the same page (technically six if you count "www"):

sitename.com/services
sitename.com/services/
sitename.com/services/index.html

Search engines display the second form (http://sitename.com/services/), and if you type the URL without the trailing "/" it redirects to the version with it. SEOmoz says each page has a 301 redirect that makes this happen, but we never set those up manually.

I've tried creating an .htaccess file that redirects sitename.com/services/ to sitename.com/services, but then the page won't load because of too many redirects.

Did I break some big rules setting it up this way?

Please note that "sitename.com/services/" is just an example of a page and our entire site of 50 pages is set up in this nature. The actual site is http://www.logicalposition.com.

Tony
    Check out URL Rewriting. While it appears many sites have many directories, and many index.html files, they typically don't. Having many directories and files would be incredibly difficult to manage. – Sampson Jan 14 '13 at 18:39

3 Answers

The preferred way is to let your server handle the URL mapping. If you are on an Apache server, for example, you could follow the suggestion below and create or change the .htaccess file to get the desired effect.

http://eisabainyo.net/weblog/2007/08/19/removing-file-extension-via-htaccess/
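
As a rough idea of what such a rule can look like, here is a commonly used pattern for serving /pagename from pagename.html. It is a sketch only, assuming Apache with mod_rewrite enabled and an .htaccess file in the site root; it is not copied from the linked article and has not been tested on the asker's host.

```apache
RewriteEngine On

# Skip real directories so folder URLs keep working as before
RewriteCond %{REQUEST_FILENAME} !-d
# Only rewrite when a matching .html file actually exists on disk
RewriteCond %{REQUEST_FILENAME}.html -f
# Internally map /pagename to /pagename.html (no redirect, so the URL in the browser stays clean)
RewriteRule ^(.+)$ $1.html [L]
```

Because this is an internal rewrite rather than a redirect, visitors and crawlers only ever see the extensionless URL.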

Jason

The most straightforward way is to use Apache's .htaccess (which if I remember correctly GoDaddy allows access to, though I may be wrong) to do redirects.

See this post: https://stackoverflow.com/a/5730126/549346 (mods: possible duplicate?), which directs you to place something like the following in your .htaccess file:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)\.html$ /$1 [L,R=301] 
Joshua

First, it sounds like you haven't done the basic legwork to minimize this. Decide whether you want www.samplesite.com or just samplesite.com; you can then enforce that choice very easily with .htaccess (see this handy tool). That leaves you with at most three variations, not six.
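
For illustration, a canonical-host rule could look like the sketch below. It assumes Apache with mod_rewrite enabled, plain http, and that the www form was chosen as canonical; none of this is confirmed for the asker's server, and it is not the output of the linked tool.

```apache
RewriteEngine On

# Any host that does not already start with "www." ...
RewriteCond %{HTTP_HOST} !^www\. [NC]
# ... is sent to the www host with a permanent redirect, keeping the path and query string
RewriteRule ^ http://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
```

Choosing the bare domain instead would mean inverting the condition and stripping the "www." prefix in the target.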

I would take @Jason's suggestion and use URL handling. Two of my clients currently use GoDaddy and both use this method, so it should be fully supported.

Some more helpful links for URL handling and .htaccess rewrites (note: setting up 301 redirects takes time, patience and careful monitoring of crawl errors in Webmaster Tools, so URL handling is preferable!)

http://net.tutsplus.com/tutorials/other/using-htaccess-files-for-pretty-urls/

Extreme example, but still relevant :) Handling several thousand redirects with .htaccess

Edit: Forcing trailing slash

You can easily force the trailing slash to appear with the following rewrite rules:

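RewriteEngine On
# Leave requests for existing files untouched
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(.*) $1 [L]
# If the URI lacks a trailing slash, 301-redirect to the same path with one
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteRule ^(.*)$ $1/ [L,R=301]
# Send anything that is not a directory to index.php
# (specific to a PHP front controller; a static site likely does not need this rule)
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?category=$1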

I think you have already done that in part, but you will notice that a 301 redirect header is sent. That means that as spiders visit your site they will update the URL to include the trailing slash, though it won't happen overnight. You might be able to use Webmaster Tools to speed up the URL changes.

Source: in part this website, which gives a good explanation of how it works.
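
The same 301 mechanism could also fold the /index.html variant from the question into the folder URL. This is a minimal sketch, assuming Apache with mod_rewrite and an .htaccess file in the site root; it is not taken from the linked source and has not been tested on GoDaddy.

```apache
RewriteEngine On

# Match only when the browser itself asked for .../index.html.
# Checking THE_REQUEST (the raw request line) avoids a redirect loop, because
# Apache internally maps /services/ to /services/index.html when serving the folder.
RewriteCond %{THE_REQUEST} ^[A-Z]+\s/([^?\s]*/)?index\.html[?\s]
# Redirect /services/index.html to /services/ (and /index.html to /)
RewriteRule ^(.*/)?index\.html$ /$1 [L,R=301]
```

Redirecting to the slash-terminated form matches what the server already does for folders, so the two rules do not fight each other.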

tim.baker
  • Thanks Tim - I have created the .htaccess file for those who exclude "www" to be sent to "www". The link that @jason sent goes over the solution but the example goes beyond the trailing slash and adds the file extension. Part of the .htaccess file I made was to redirect anyone that types in /index.html to redirect to the folder so that /index.html no longer shows. This tutorial seems to counter that (in the second part about adding the slash). My goal now is to correct the error the SEOmoz report is kicking out that says domain.com/folder/ is a different page than domain.com/folder – Tony Jan 15 '13 at 16:22