3

I currently have a website where guests can reach any URL using any number of slashes between folder names. For example, if a URL is supposed to be:

http://example.com/one/two/three/four

Then users could access the same page via any of the following:

http://example.com/one//two///three////four/////
http://example.com/one/two////three/four/////
http://example.com///one///////////two////three/four/
http://example.com///////////one///////////two/three/four

However, I want all of the above example URLs to redirect users to this single URL:

http://example.com/one/two/three/four

This is my .htaccess file, which attempts to strip the extra slashes:

RewriteCond %{ENV:REDIRECT_STATUS} !^$
RewriteRule .* - [L]
RewriteRule ^(.*)/+$ /$1 [R=301,L,NC]
RewriteCond %{REQUEST_URI} ^/+(.*)/+$
RewriteRule .* /%1 [R=301,L]

The third line successfully strips trailing slashes on long URLs. The 4th and 5th lines are my attempt to strip the extra slashes right after the domain name, but that was unsuccessful.

I'm asking because I don't want Google to flag the site for duplicate content. With AdSense active on the site, Google will likely crawl every URL that I access.

Is there a RewriteCond/RewriteRule combo I can use to strip the middle slashes, or is it more involved?

anubhava
Mike -- No longer here

2 Answers

20

You can use this rule to remove multiple slashes anywhere in the URL except the query string:

RewriteCond %{THE_REQUEST} \s[^?]*//
RewriteRule ^.*$ /$0 [R=302,L,NE]
anubhava
  • 2
    e.g. `https://localhost///////////one///////////two//////////three/four///////` will become `https://localhost/one/two/three/four/` in a single redirect. – anubhava Aug 11 '15 at 06:20
  • 1
    Definitely works, @Mike you should accept this answer. Thanks anubhava. – DeDee Nov 15 '15 at 18:40
  • It will work with any run of two or more consecutive slashes. – anubhava Jan 31 '20 at 06:12
  • Is this an undocumented behavior? Because Apache HTTP Server documentation states: `$0 provides access to the whole string matched by that pattern.` It does not mention that it will auto squeeze all `//` into single `/`. – Maris B. Nov 20 '20 at 09:31
  • 1
    @MarisB. It is not always easy to find a documentation reference for every feature. Documentation is scattered across multiple places, but it is an established fact that multiple `//` get converted to a single `/` in the pattern of `RewriteRule` – anubhava Nov 20 '20 at 09:34
  • 2
    @MarisB. It's not the `$0` backreference itself that reduces multiple slashes. As you state, `$0` simply contains the whole string matched by the pattern. In a _directory_ (or `.htaccess`) context, the `RewriteRule` _pattern_ matches against the URL-path _after_ it has been mapped to the filesystem. It is the process of mapping the request to the filesystem that reduces multiple slashes. Conversely, the same directive does not work if used in a _server_ or _virtualhost_ context - which is processed _before_ the request is mapped to the filesystem (when multiple slashes have not been reduced). – MrWhite Dec 07 '20 at 17:12
  • @anubhava, can you please explain why 302 and not 301? Who would want to remove slashes only temporarily? – Andrei Jan 20 '21 at 20:58
  • 302 is only for testing, so that you don't have to clear the browser cache while testing. Once tested, it should be `301` only. – anubhava Jan 20 '21 at 21:10
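
Pulling the comment thread above together, a minimal sketch of the post-testing form of the rule might look like the following. It assumes the rule lives in a per-directory `.htaccess` file (per MrWhite's comment, it relies on multiple slashes already being reduced before the `RewriteRule` pattern is matched, so it is not expected to work in a server or virtualhost context), that `mod_rewrite` is enabled, and that testing with 302 is finished. The `RewriteEngine On` line is an assumption; omit it if it already appears earlier in the file.

```apache
# Assumption: mod_rewrite is enabled and this file is a per-directory .htaccess.
RewriteEngine On

# THE_REQUEST is the raw request line, e.g. "GET /one//two HTTP/1.1".
# The pattern matches only when "//" appears in the path, before any "?".
RewriteCond %{THE_REQUEST} \s[^?]*//

# $0 is the whole string matched by the pattern. R=301 makes the redirect
# permanent, as recommended in the comments once testing with 302 is done.
# NE stops the substitution from being escaped again.
RewriteRule ^.*$ /$0 [R=301,L,NE]
```

Because browsers cache 301 responses aggressively, it is probably worth confirming the behavior with `R=302` first, as anubhava suggests, and switching to `R=301` only afterwards.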
1

This works for me:

RewriteCond %{REQUEST_URI} ^(.*)//(.*)$
RewriteRule . %1/%2 [R=301,L]
weblegko