6

If I use a domain such as www.äöü.com, is there any way to avoid it being displayed as www.xn--4ca0bs.com in users’ browsers?

I suspect that domains such as www.xn--4ca0bs.com confuse a lot of average internet users.

IMSoP
user1360250
  • Not using IDNs at all would be a solution. ;) – ThiefMaster Jun 13 '12 at 07:06
  • The answers here are obsolete as of 2023; browsers generally display names in the human-readable way now, provided the name is readable *in any single culture.* Browsers now disregard homographs when deciding how to display domain names (but FYI, registries have rules regarding registering homograph domains). – arnt Feb 02 '23 at 10:51

3 Answers

10

This is entirely up to the browser. In fact, IDNs are pretty much a browser-only technology. Domain names as stored in the DNS cannot contain non-ASCII characters, so the actual domain name is always the Punycode-encoded xn--... form. It's up to the browser to prettify this, but many choose not to, in order to avoid domain name spoofing with lookalike Unicode characters.
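The mapping between the two forms can be reproduced outside a browser. The following is a minimal sketch, assuming Python 3 and its built-in `idna` codec (which implements the older IDNA 2003 rules, so it may differ from what a current browser does for some names); the variable names are illustrative only.

```python
# Round trip between the Unicode form and the ASCII (Punycode) form
# of the hostname from the question, using Python's built-in codec.
unicode_name = "www.\u00e4\u00f6\u00fc.com"  # www.äöü.com, written with escapes

# Encode each label; non-ASCII labels get the "xn--" prefix.
ascii_name = unicode_name.encode("idna").decode("ascii")
print(ascii_name)  # www.xn--4ca0bs.com

# Decode the ASCII form back to the human-readable form.
round_tripped = ascii_name.encode("ascii").decode("idna")
print(round_tripped == unicode_name)  # True
```

The ASCII form is what actually goes into a DNS query; the readable form is only a presentation choice made by the application.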

deceze
  • Domain names can contain non-ASCII characters. Russia, China and Serbia already allow users to register such domains. See http://www.w3.org/International/articles/idn-and-iri/ – Milan Babuškov Jun 13 '12 at 06:45
  • But the *actual* domain name in the DNS cannot. That's why they're encoded into the `xn--...` Punycode format. – deceze Jun 13 '12 at 06:47
  • Hence the name IDNA: International Domain Names **in Applications**. – TRiG Nov 20 '13 at 15:25
  • This is really interesting. I'm just learning about this now. [Here is](https://chromium.googlesource.com/chromium/src/+/master/docs/idn.md) chrome's policy for deciding whether to prettify internationalized domain names. – bbsimonbb Apr 27 '21 at 13:31
6

From a security perspective, Unicode domains can be problematic because many Unicode characters are difficult to distinguish from common ASCII characters (or indeed other Unicode characters).

It is possible to register domains such as "xn--pple-43d.com", which is equivalent to "аpple.com". It may not be obvious at first glance, but "аpple.com" uses the Cyrillic "а" (U+0430) rather than the ASCII "a" (U+0061). This is known as a homograph attack.
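To see the difference that the eye misses, the two names can be compared programmatically. This is a small sketch, assuming Python 3 with the standard `unicodedata` module and the built-in `idna` codec; `fake` and `real` are illustrative names.

```python
import unicodedata

fake = "\u0430pple.com"  # first character is U+0430, not ASCII "a"
real = "apple.com"

# The strings render almost identically but are not equal.
print(fake == real)  # False

# The Unicode character names reveal the substitution.
print(unicodedata.name(fake[0]))  # CYRILLIC SMALL LETTER A
print(unicodedata.name(real[0]))  # LATIN SMALL LETTER A

# Only the lookalike needs Punycode; the real name is already ASCII.
fake_ascii = fake.encode("idna").decode("ascii")
real_ascii = real.encode("idna").decode("ascii")
print(fake_ascii)  # xn--pple-43d.com
print(real_ascii)  # apple.com
```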

Fortunately, modern browsers have mechanisms in place to limit IDN homograph attacks. Chrome's IDN policy page describes the conditions under which an IDN is displayed in its native Unicode form. Generally speaking, the Unicode form is hidden if a domain label mixes characters from multiple different scripts, such as Cyrillic and Latin. The "аpple.com" domain described above will appear in its Punycode form, "xn--pple-43d.com", to limit confusion with the real "apple.com".
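The mixed-character idea can be illustrated with a rough heuristic. This is not Chrome's actual algorithm, which has many more rules and exceptions; it is only a sketch that assumes the first word of a character's Unicode name (for example "LATIN" or "CYRILLIC") is a good enough stand-in for its script. The function names `scripts_in_label` and `looks_mixed_script` are hypothetical.

```python
import unicodedata

def scripts_in_label(label):
    """Guess the scripts in a label from the first word of each letter's
    Unicode name. This is an approximation, not a real script lookup."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()
    }

def looks_mixed_script(label):
    """Return True if the label appears to combine more than one script."""
    return len(scripts_in_label(label)) > 1

print(sorted(scripts_in_label("\u0430pple")))    # ['CYRILLIC', 'LATIN']
print(looks_mixed_script("\u0430pple"))          # True: Cyrillic a + Latin
print(looks_mixed_script("apple"))               # False
print(looks_mixed_script("\u00e4\u00f6\u00fc"))  # False: äöü is all Latin
```

Under this heuristic, the label from the question (äöü) would pass, while the lookalike "аpple" would be flagged and shown in its `xn--` form.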

For more information see this blog post by Xudong Zheng.

IMSoP
learner
0

Internet Explorer 8.0 on Windows 7 displays your Unicode domain just fine. Google Chrome 19, on the other hand, doesn't.

Read more here: An Introduction to Multilingual Web Addresses #phishing.

Different browsers do things differently, possibly because some use the system code page, locale, or encoding, while others use their own settings or a list of allowed characters.

Read that article carefully; it explains how each browser makes its decision. If you are targeting a specific language, you can get away with it and make it work.

oxygen