Apache: Why does .htaccess rewrite non-Latin-character URLs to their urlencoded form in IE, and how can I fix it?

domain.com//контакты rewrites to domain.com/%D0%BA%D0%BE%D0%BD%D1%82%D0%B0%D0%BA%D1%82%D1%8B

RewriteEngine On
# Rewrite multiple slashes with single slash after domain
RewriteCond %{THE_REQUEST} ^[A-Z]+\s//+(.*)\sHTTP/[0-9.]+$ [OR]
RewriteCond %{THE_REQUEST} ^[A-Z]+\s(.*/)/+\sHTTP/[0-9.]+$
RewriteRule .* http://%{HTTP_HOST}/%1 [R=301,NE,L]
Binyamin
  • This is actually correct: Unicode characters are not valid in URLs; they need to be percent-encoded. What is the exact problem? Is URL-encoded data being passed to your script? – Pekka Feb 09 '11 at 22:07
  • @Pekka: Yes. In the newest browsers, URLs are UTF-8 encoded, and those browsers rewrite domain.com//контакты to domain.com/контакты, as expected. – Binyamin Feb 09 '11 at 22:12

2 Answers

It seems there is still no solution for IE; the behavior depends on how the browser encodes the address bar. For example, Chrome 12 displays the UTF-8 URL domain.com/контакты, but if you copy and paste it (Ctrl+C, Ctrl+V) you get the percent-encoded URL http://domain.com/%D0%BA%D0%BE%D0%BD%D1%82%D0%B0%D0%BA%D1%82%D1%8B

Binyamin

@Pekka: Yes. In the newest browsers, URLs are UTF-8 encoded, and those browsers rewrite domain.com//контакты to domain.com/контакты, as expected.

As far as I know, this is not entirely correct and this may be the reason why it doesn't work for you.

What new browsers actually do is display the URL in its native form (контакты) while still making the request internally in percent-encoded form.

This seems to be the crux: You type something in, IE recognizes the Unicode characters, but the subsequent 301 redirect leads to a (correct) percent encoded URL. A modern browser will automatically "decode" the percent encoded URL; IE and older browsers might not.
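To make the two forms concrete, here is a minimal PHP sketch (the variable names are illustrative, and it assumes the file is saved as UTF-8) showing that the readable path and the percent-encoded path carry the same bytes:

```php
<?php
// Assumption: this source file is saved as UTF-8.
$readable = 'контакты';

// rawurlencode() percent-encodes every UTF-8 byte of the string.
$encoded = rawurlencode($readable);
echo $encoded, "\n"; // %D0%BA%D0%BE%D0%BD%D1%82%D0%B0%D0%BA%D1%82%D1%8B

// rawurldecode() reverses the encoding and yields the original bytes.
echo rawurldecode($encoded), "\n"; // контакты
```

The encoded form is what travels over the wire; whether the address bar shows it decoded is up to the browser.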

I don't know whether there is a workaround for this. One thing you could try is to have a PHP script do the URL rewriting and redirect to the URL-decoded (though strictly invalid!) URL:

<?php header("Location: http://example.com/контакты"); exit; ?>

I have no idea whether this'll work, though.
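A slightly fuller sketch of that idea, untested across browsers: it assumes every request is routed to this script and that `$_SERVER['REQUEST_URI']` still contains the doubled slashes as the client sent them. The helper name `normalize_request_uri` is hypothetical, and unlike the .htaccess rules in the question it collapses repeated slashes anywhere in the path, not only at the start or end.

```php
<?php
// Hypothetical helper: collapse runs of slashes in the path part of a
// request URI and return the path in decoded (human-readable) form.
// The query string, if any, is passed through unchanged.
function normalize_request_uri(string $requestUri): string
{
    $parts = explode('?', $requestUri, 2);
    $path  = preg_replace('#/{2,}#', '/', $parts[0]);

    // Keep the encoded form if decoding would introduce control
    // characters (e.g. CR/LF), which must never reach a header.
    $decoded = rawurldecode($path);
    if (preg_match('/[\x00-\x1F\x7F]/', $decoded) === 0) {
        $path = $decoded;
    }

    return isset($parts[1]) ? $path . '?' . $parts[1] : $path;
}

// Redirect only under a web server, and only when the path has a double slash.
if (PHP_SAPI !== 'cli' && isset($_SERVER['REQUEST_URI'], $_SERVER['HTTP_HOST'])) {
    $uri  = $_SERVER['REQUEST_URI'];
    $path = explode('?', $uri, 2)[0];
    if (strpos($path, '//') !== false) {
        header('Location: http://' . $_SERVER['HTTP_HOST'] . normalize_request_uri($uri), true, 301);
        exit;
    }
}
```

The resulting Location value contains raw UTF-8 bytes, so the same caveat applies: it is not a strictly valid URL, and browsers may treat it differently.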

Reference: Unicode characters in URLs

Related IE setting

Pekka
  • @Binyamin not inside .htaccess, but I just edited in a potential workaround idea that you could try out. – Pekka Feb 09 '11 at 22:24
  • @Pekka: `` returns the decoded URL even in IE. Do you have a PHP solution for decoding the URL if it contains encoded characters? – Binyamin Feb 09 '11 at 22:43
  • @Binyamin hmm, you should be able to do a `urldecode()` on the incoming URL, and to forward to the result. Be sure to test it in all browsers though, this is unknown territory (and as said, not valid in the strict sense of the law) – Pekka Feb 09 '11 at 22:47
  • In IE, how can I make PHP's `header("location: ...")` redirect from the urlencoded URL to the urldecoded UTF-8 URL, when `echo $_GET['id']` returns `контакты` even though the URL in the address bar is `%D0%BA%D0%BE%D0%BD%D1%82%D0%B0%D0%BA%D1%82%D1%8B`? – Binyamin Feb 10 '11 at 17:23
  • @Binyamin I don't understand your question. Can you clarify? – Pekka Feb 10 '11 at 17:24
  • @Pekka: How can I check with PHP whether the string in the address bar is urlencoded? Even when IE shows domain.com/%D0%BA%D0%BE%D0%BD%D1%82%D0%B0%D0%BA%D1%82%D1%8B in the address bar, `echo $_GET['id'];` returns `контакты` and not `%D0%BA%D0%BE%D0%BD%D1%82%D0%B0%D0%BA%D1%82%D1%8B`. – Binyamin Feb 10 '11 at 17:43
  • @Binyamin if you do a `urldecode()` on a string that isn't URL encoded, you should get the same string back. You should be able to just do a `urldecode()` and work with the result – Pekka Feb 10 '11 at 17:44
  • Unfortunately, that does not help, because Apache rewrites it to the urlencoded URL anyway; see the .htaccess in my question above. – Binyamin Feb 10 '11 at 17:55
  • @Binyamin my suggestion was not to use Apache's rewriting at all, but instead doing it in PHP – Pekka Feb 10 '11 at 17:58
  • @Pekka: Then is there any PHP solution for replacing multiple slashes after the domain with a single slash when index.php is located not in the root directory but at domain.com/some-dir/index.php? Probably not? – Binyamin Feb 10 '11 at 18:03
  • @Binyamin it should be possible to rewrite all incoming requests to that PHP file - I can't see why not. – Pekka Feb 10 '11 at 18:04
  • @Pekka: Then how do I rewrite the link domain.com//some-dir/ to domain.com/some-dir/ through the PHP file domain.com/some-dir/index.php? – Binyamin Feb 10 '11 at 19:38