When I navigate to aртем.example.com, the web browser will convert the URL to xn--a-jtbvok.example.com. Is there a way to convert xn--a-jtbvok back to aртем using PHP?

Currently I am using $_SERVER['HTTP_HOST'] to fetch the requested host name.

Live examples here: aртем.lekintepls.se and åäö.lekintepls.se.

I have no idea what this phenomenon is called, so I'm sorry if this is a duplicate.

  • I would recommend using only *standard* characters in URLs, but why don't you just check if it equals `xn--a-jtbvok`? `if ($_SERVER['HTTP_HOST'] === 'xn--a-jtbvok') { /* ... */ }` – tleb Jul 05 '15 at 20:01
  • I want the subdomain to be displayed on the page, just like it is now. It's literally the site's only purpose. Making checks for every possible subdomain isn't possible. – Andreas Vennström Jul 05 '15 at 20:06
  • As said [here](http://stackoverflow.com/a/1916747/4255615), URLs should only contain the *standard* characters available. – tleb Jul 05 '15 at 20:19

2 Answers

This notation is called Punycode. It is the ASCII encoding used by IDNA (Internationalizing Domain Names in Applications).

You can use idn_to_utf8 and idn_to_ascii (both provided by PHP's intl extension) to convert between the two notations.
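As a rough sketch of how this could be applied to the question's `$_SERVER['HTTP_HOST']` case: the helper name `displayHost` and the fallback host string are made up for illustration, and the code assumes the intl extension is installed (it falls back to the raw host if not). Going by the question, `xn--a-jtbvok.example.com` should come back in its Unicode form.

```php
<?php
// Hypothetical helper (not part of PHP): turns a Punycode host into its
// Unicode form for display, falling back to the input when it cannot.
function displayHost(string $host): string
{
    // HTTP_HOST may carry a ":port" suffix; strip it before decoding.
    $host = preg_replace('/:\d+$/', '', $host);

    // idn_to_utf8() only exists when the intl extension is loaded.
    if (!function_exists('idn_to_utf8')) {
        return $host;
    }

    // Returns false on failure, e.g. for a malformed label.
    $unicode = idn_to_utf8($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

    return $unicode === false ? $host : $unicode;
}

// The fallback value is only here so the script also runs from the CLI.
$host = $_SERVER['HTTP_HOST'] ?? 'xn--a-jtbvok.example.com';

// The host header is client-supplied, so escape it before printing.
echo htmlspecialchars(displayHost($host), ENT_QUOTES, 'UTF-8');
```

The reverse direction works the same way with `idn_to_ascii`, which also returns `false` on failure.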

Dekel

As quoted here, RFC 1738 says:

Thus, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL.

This suggests that Unicode characters should not be used unencoded in URLs.

tleb