I have done a lot of research on removing subfolders, but I cannot find a way to write an .htaccess rule that removes all subfolders in my root directory. Example below:

www.domain.com/dan/dan changes to www.domain.com/dan

www.domain.com/pam/pam changes to www.domain.com/pam

www.domain.com/jam/jam changes to www.domain.com/jam

The .htaccess rule should keep this pattern up through infinity without me having to add the names of the subfolders to my rule, kind of like a wildcard condition or catchall scenario.

However, there is one condition: only remove the subfolder if the file has the same name, as illustrated in my example above.

I'm on Apache 1.3.42, so I need a solution that does not depend on newer versions.

Check out my .htaccess file below. I've done a lot of SEO work on it, as you can see:

AddType application/x-httpd-php .html

RewriteEngine On
RewriteBase /

#non www to www

RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

#removing trailing slash

RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)/$ $1 [R=301,L]

#html

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^\.]+)$ $1.html [NC,L]

#index redirect

#directory remove index.html

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.arkiq.com/ [R=301,L]

#directory remove index 

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\ HTTP/
RewriteRule ^index http://www.arkiq.com/ [R=301,L]

#sub-directory remove index.html

RewriteCond %{THE_REQUEST} /index\.html
RewriteRule ^(.*)/index\.html$ /$1 [R=301,L]

#sub-directory remove index

RewriteCond %{THE_REQUEST} /index
RewriteRule ^(.*)/index /$1 [R=301,L]

#remove .html

RewriteCond %{THE_REQUEST} \.html
RewriteRule ^(.*)\.html$ /$1 [R=301,L]

Let me know if you know how to forward all subfolders to their respectively named files with one rule as that would be superb.

Richard Morris

1 Answer

I have no setup here to test this rule against a real Apache installation, but I am pretty sure you can achieve this with a positive lookahead and a backreference to a capture group.

RewriteRule ^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$ /$1$4 [R,L]

What does this do? ^(.*?) matches everything before the last two path segments; for example.com/test/test it matches nothing at all. ([^/]+) matches the first segment we want to test and stores it in capture group 2. (?=\2(/|$)) is the positive lookahead: it 'peeks' at the next characters without consuming any. \2 is a backreference to the second capture group, and (/|$) matches either a slash or the end of the string. The last ([^/]+) matches the second segment (capture group 4, since the group inside the lookahead is number 3), and /? ensures the URL still matches when it ends with a /. After applying this rule, this should happen:

example.com/test/test --> example.com/test
example.com/test/test2 --> no rewrite, because '2' does not match '/' or the end of the string
example.com/test/test/ --> example.com/test
example.com/sub/test/test --> example.com/sub/test
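
The table above can be checked outside Apache. The following is a minimal sketch that runs the same pattern through Python's `re` module, which (like PCRE) supports backreferences and lookahead. It is only a stand-in for Apache's regex engine, and the helper name `redirect_target` is made up for this illustration. Paths are given without a leading slash, as mod_rewrite sees them in .htaccess context.

```python
import re

# Same pattern as the RewriteRule above. Python's re engine is used here
# as a stand-in; it is an assumption that it behaves like Apache's engine.
PATTERN = re.compile(r"^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$")


def redirect_target(path):
    """Return the redirect target, or None when the rule does not match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    # Equivalent of the substitution /$1$4
    return "/" + match.group(1) + match.group(4)


for path in ["test/test", "test/test2", "test/test/", "sub/test/test", "atest/test"]:
    print(path, "->", redirect_target(path))
```

One caveat this sketch exposes: because `(.*?)` is not required to end at a slash, `atest/test` also matches and maps to `/atest`. Changing the first group to `(.*/|)`, as the internal rewrite further down does, would restrict the match to whole segments.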

Debugging this rule

If you get an internal server error, open your Apache error log and read the error it reports. Here is proof that the rule works in a clean .htaccess on Apache 2.4.4. Checking an error log takes about a minute, whereas reading the patch notes for every Apache release of the last 3 years would take me several hours.

External redirect, internal rewrite, preventing infinite loop

Assuming the above rule works on your version of mod_rewrite, Apache and its regex engine, the following construction externally redirects the request and then internally rewrites it back. Note that /test/test will not do anything sensible unless you tell Apache how to execute such a file. Proof of concept.

#The external redirect
RewriteCond %{THE_REQUEST} ^(GET|POST)\ /(.*?)([^/]+)/(?=\3(/|\ ))
RewriteRule ^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$ /$1$4 [R,L]

#The internal rewrite
RewriteCond %{REQUEST_URI} !^/(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$
RewriteCond %{REQUEST_FILENAME} -d
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*/|)([^/]+)/?$ /$1$2/$2 [L]
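
Why the two rule blocks above do not loop can be traced with a small simulation. This is a rough sketch in Python, not a model of mod_rewrite itself: the function `handle`, the `directories` set (standing in for the `-d` check) and the cap of 10 passes are all assumptions made for illustration.

```python
import re

DUPLICATE = re.compile(r"^(.*?)([^/]+)/(?=\2(/|$))([^/]+)/?$")
REQUEST_LINE = re.compile(r"^(GET|POST) /(.*?)([^/]+)/(?=\3(/| ))")
LAST_SEGMENT = re.compile(r"^(.*/|)([^/]+)/?$")


def handle(request_line, directories):
    """Simulate one request; return ("redirect", url) or ("serve", path)."""
    # Path without the leading slash, as seen in .htaccess context.
    path = request_line.split(" ")[1].lstrip("/")
    # The rules are re-run after an internal rewrite; cap the passes.
    for _ in range(10):
        dup = DUPLICATE.match(path)
        # External redirect: only when the client's own request line had
        # the duplicated segment, never for an internally rewritten path.
        if dup and REQUEST_LINE.match(request_line):
            return ("redirect", "/" + dup.group(1) + dup.group(4))
        last = LAST_SEGMENT.match(path)
        # Internal rewrite: dir -> dir/dir when dir is a known directory.
        if not dup and last and path.rstrip("/") in directories:
            path = last.group(1) + last.group(2) + "/" + last.group(2)
            continue
        return ("serve", "/" + path)
    raise RuntimeError("rewrite loop")


print(handle("GET /test/test HTTP/1.1", {"test"}))  # external redirect
print(handle("GET /test HTTP/1.1", {"test"}))       # internal rewrite
```

The request line never changes during internal rewrites, so once `/test` has been rewritten to `/test/test` the redirect condition still sees `GET /test` and stays quiet.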

You mention DirectorySlash Off. Note that on current versions of Apache the trailing-slash redirect is only applied to an actual external request; internal rewrites are not affected. In both examples above, on Apache 2.4.4, even though I redirect to a URL without a trailing slash, Apache still appends a slash in a second redirect. I do not know how this was handled in 1.3.


If Apache 1.3 doesn't support backreferences or lookaround in its regex engine, which I still can't test, there is no real way to check via mod_rewrite whether a URL contains two identical segments. You'll either need a custom router page or have to write out every URL by hand (which can cause performance issues, as there are likely a lot of them). Rewriting to a router page goes like this:

RewriteRule ^(.*)$ /myrouter.php?url=$1 [L]

This router page, written in a language of your choice, can send the 301 or 302 header with a custom Location itself. It will also need to handle every other request matched by the RewriteRule above.
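
The decision such a router page has to make needs no regex backreferences at all. Here is a minimal sketch in Python (the rule above points at a PHP file, but the logic carries over); the function name `route` and its return convention are assumptions made for illustration, and actually sending the header is left to whatever language the router is written in.

```python
def route(url):
    """Decide what a router page would do with the captured `url` value.

    Returns (status, location) for a redirect, or None when the request
    should be handled normally.
    """
    # Split into non-empty segments so a trailing slash is ignored.
    segments = [s for s in url.split("/") if s]
    # Redirect only when the last two path segments are identical.
    if len(segments) >= 2 and segments[-1] == segments[-2]:
        return (301, "/" + "/".join(segments[:-1]))
    return None


print(route("dan/dan"))
print(route("sub/test/test"))
print(route("atest/test"))
```

Comparing whole segments also avoids partial matches such as `atest/test`.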

Sumurai8
  • @anubhava Actually, to be clearer I want www.domain.com/dan/ to point to www.domain.com/dan/dan but I want www.domain.com/dan/dan to look like www.domain.com/dan if that makes any sense. It is a roundabout way to remove the slash for Apache 1.3.42 since the modern DirectorySlash Off crashes my server. Anyways, I pasted your code into my .htaccess file and it crashed me too, so no dice. – Richard Morris Nov 17 '13 at 09:35
  • @RichardMorris You know, I am getting slightly annoyed with you. You post a lot of questions with lots of fancy words showing that you have no clue what you are talking about. That's fine with me. I answered this question, because the answer is not basic knowledge. How to externally redirect A to B and internally rewrite B to A is however basic knowledge, and can be found on every corner of the internet. DO SOME RESEARCH YOURSELF. There are no rewriterules that should ever *crash* apache (apache service stops). You are probably getting an internal server error. Find out what that is. – Sumurai8 Nov 17 '13 at 10:52
  • @RichardMorris Find out how to find out WHAT error occurred. [Here](http://www.screenr.com/CeQH) is proof that this rule works just fine. I suppose that, because you are using an ancient Apache version, you are not only vulnerable to a wide range of attacks on your server, but also might be missing lookaround functionality in the regex engine. I am happy to teach things, but if I need to prepare everything for you so you can lazily copy/paste it (without actually learning from it), then please go to another site. – Sumurai8 Nov 17 '13 at 10:53
  • @Sumurai8 Why are you feeling annoyed? I have only posted 4 questions at this website so far, I was not aware there was a limit? Fancy words, please clarify? Please explain why you think I don’t know what I’m talking about? I know the answer is not basic knowledge because if it was I would have found the answer already. “How to externally redirect A to B and internally rewrite B to A is however basic knowledge, and can be found on every corner of the internet.” Clearly this is not basic knowledge because you didn’t even know the answer your own self since your answer above failed. – Richard Morris Nov 17 '13 at 12:58
  • @Sumurai8 If you read my post in full, you would realize I have already done a lot of research. An internal error is a crash; if you mean a catastrophic crash, then no, not that. Please stop using such a patronizing tone with me, your abusive behavior is not appreciated. Ancient Apache, there is no need to talk down to users here, please stop this at once. I am happy to teach you too. I’m pretty sure admin wants me on their network clicking their ads, not on some other site so thanks anyways but no thanks. Now please clean up your tone. – Richard Morris Nov 17 '13 at 12:59
  • 1.3.42 is more than 3 years old, and expecting it to behave as well as a recent version of Apache is like expecting a horse and wagon to get you anywhere in a decent amount of time. I am annoyed because you tell me that 'it crashes', but you didn't even think about looking at the error log to find out what actually happened. I am getting annoyed that, even though you only request an external redirect from A to B in your question, you expect me to magically understand you want to get from B to A too, even though A is still a directory and most likely will do nothing. – Sumurai8 Nov 17 '13 at 14:29
  • I have no way of testing this on your version of apache, so I can't tell you what is wrong. To get an external redirect and an internal rewrite back to work properly, you'll have to use `THE_REQUEST` with your version of apache. An example can be found [here](http://stackoverflow.com/a/11353528/2209007), [here](http://stackoverflow.com/a/12734160/2209007), [here](http://stackoverflow.com/a/16043629/2209007), [here](http://stackoverflow.com/a/15891703/2209007) and [here](http://stackoverflow.com/a/6520829/2209007). – Sumurai8 Nov 17 '13 at 14:37
  • @Sumurai8 Apache 1.3.42 might be 3 years old however considering many people are still using 10+ year old Windows XP it is not that old in comparison. You have no right to ever be annoyed with me. The error is RewriteRule: cannot compile regular expression. Put a lid on your condescending attitude, if you don’t know the answer then leave it at that because I don’t have time to read your ignorant rhetoric any longer. – Richard Morris Nov 18 '13 at 01:50
  • @Sumurai8 Read my original question, I am using THE_REQUEST with my version of Apache already. Five links that address unrelated issues I have already solved in my above question yet have nothing to do with the actual issue in my question, really. I have a feeling you could not solve this issue if I paid you to, so don’t bother “helping” me anymore unless it comes with an apology first. Goodbye. – Richard Morris Nov 18 '13 at 01:51
  • I've edited my answer with a solution to A->B, B->A or B->custom file, where the client sees B. All rules do, and did redirect to a version without a /, but if I don't have `DirectorySlash off` in my .htaccess, it will do a silent second redirect to the version with a slash. I am unsure how this works in 1.3.42, and I am unwilling to spend a lot of time installing outdated software that is incompatible with the rest of the software I am running. If nothing else works, you can also go for a solution with a router file in your favourite programming language that handles the redirect. – Sumurai8 Nov 18 '13 at 10:02
  • @Sumurai8 Nice to see you have appreciably cleaned up your tone, good work. Unfortunately I got another Internal Server Error which reads: RewriteCond: cannot compile regular expression '^(GET|POST)\\ /(.*?)([^/]+)/(?=\\3(/|\\ ))'\n At this juncture, I have probably spent too much time working on removing a simple slash, we’ll just have to consider this issue irresolvable. Thanks for trying nevertheless. – Richard Morris Nov 18 '13 at 10:42