Questions tagged [punycode]

Punycode is an encoding syntax by which a string of Unicode characters can be translated into the basic ASCII characters permitted in network host names. Examples: mañana.com, bücher.com and café.com.

Punycode is used for internationalized domain names (IDN), through the mechanism known as IDNA (Internationalizing Domain Names in Applications).

For example, when you type café.com into your browser, the browser (which is the IDNA-enabled application) first converts the string to the Punycode form "xn--caf-dma.com", because the character 'é' is not allowed in regular domain names. Older browsers without IDNA support won't handle such domains.
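
The café.com conversion described above can be reproduced with a short Python sketch. It relies only on the standard library's built-in `idna` codec (which implements the older IDNA 2003 rules); the helper names `to_ascii` and `to_unicode` are made up for illustration and are not part of any library.

```python
# Minimal sketch: convert internationalized domain names to and from their
# Punycode ("xn--") form using Python's built-in "idna" codec (IDNA 2003).
# The helper names below are hypothetical, chosen for this example only.

def to_ascii(domain: str) -> str:
    """Return the ASCII (Punycode) form of a Unicode domain name."""
    return domain.encode("idna").decode("ascii")


def to_unicode(domain: str) -> str:
    """Return the Unicode form of an ASCII (Punycode) domain name."""
    return domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    for name in ("mañana.com", "bücher.com", "café.com"):
        ascii_name = to_ascii(name)
        # e.g. café.com -> xn--caf-dma.com -> café.com
        print(name, "->", ascii_name, "->", to_unicode(ascii_name))
```

Each label of the domain is encoded separately, so the ASCII-only "com" label is left unchanged and only the label containing non-ASCII characters receives the "xn--" prefix.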

80 questions
23
votes
2 answers

Converting punycode with dash character to Unicode

I need to convert the punycode NIATO-OTABD to nñiñatoñ. I found a text converter in JavaScript the other day, but the punycode conversion doesn't work if there's a dash in the middle. Any suggestion to fix the "dash" issue?
Lindsay
17
votes
1 answer

Amazon SES - Non-ASCII Characters in e-mail address

I'm trying to send an e-mail using Amazon SDK for .NET and SES. I have an e-mail address which consists of special letters, for example: ęxąmplę@źćż.com. For the domain part, I read about Punycode and that works fine. But for the local part of the…
13
votes
5 answers

Can punycode-encoded email addresses clash with "real" addresses?

The problem is this: I'm using a third-party Email delivery service that doesn't accept mail addresses with non-ASCII characters in the name part, like müller@example.com . Encoding such an address with…
Martin T.
12
votes
3 answers

What is the maximum length of an IDNA converted domain name?

First things first: I'm storing multiple domains to a database, after I've converted each and every domain name to its IDNA version. What I need to know is the maximum length such an IDNA-converted domain name can have so I can define the database…
user1093284
8
votes
2 answers

Node.js Emoji Parsing

I'm trying to parse an incoming string to determine whether it contains any non-emojis. I've gone through this great article by Mathias and am leveraging both native punycode for the encoding / decoding and regenerate for the regex generation. I'm…
thekevinscott
7
votes
2 answers

Unicode domain name in Nginx server_name

I am trying to set up a server with a domain name called "privatinstruktør.dk" but keep getting redirected to the default "welcome to nginx" page. I have tried to type in the server_name like this: server { listen 80; server_name…
Theis Borg
6
votes
1 answer

CookieContainer does not store cookies for internationalized domain names

I'm trying to perform authorization on a Cyrillic domain using WebClient. Authorization goes through a few stages with redirects between normal and punycode domains. The problem is HttpWebRequest cannot store cookies in the assigned CookieContainer if it…
Leff
6
votes
3 answers

Is there any way to avoid showing "xn--" for IDN domains?

If I use a domain such as www.äöü.com, is there any way to avoid it being displayed as www.xn--4ca0bs.com in users’ browsers? Domains such as www.xn--4ca0bs.com cause a lot of confusion with average internet users, I guess.
user1360250
5
votes
1 answer

Python: convert punycode back to Unicode

I'm trying to add contacts to Sendgrid from a db which occasionally stores the user email in punycode, e.g. example-email@xn--yaho-sqa.com, which translates to example-email@yahóo.com in Unicode. Anyway, if I try to add the ASCII version there's an…
locose
5
votes
3 answers

Can I use non-Latin characters in my robots.txt and sitemap.xml?

Can I use non-Latin characters in my robots.txt file and sitemap.xml like this? robots.txt User-agent: * Disallow: /somefolder/ Sitemap: http://www.domainwithåäö.com/sitemap.xml sitemap.xml
user1087110
4
votes
1 answer

IDNA does not round-trip

I have some IDNA encoded strings that I cannot decode. In Python, I try u"xn--grohandel-shop-2fb".decode("idna") and get the error "IDNA does not round-trip". The same for "xn--sottmqqo5-lgbe9b7no0hmz9u". I'm stumped, and Googling the error doesn't…
Steve
4
votes
2 answers

punycode proper email addresses

When using an email address that has Unicode characters, such as josé@abç.РФ, do you need to punycode-convert both sides or just the right-hand side? josé@xn--ab-5ia.xn--s0ai or xn--jos-dma@xn--ab-5ia.xn--s0ai
jimlongo
4
votes
1 answer

How do I know when to do a UTF8 or punycode DNS query?

I have an application with an address bar, and users type in an IRI to which I must connect. On unix/Darwin, this is simple: I flatten the IDN to a URI as described in RFC3987. That is, if the scheme has an authority section, I map that to ASCII…
Nicholas Wilson
3
votes
1 answer

Do browsers encode only the domain in punycode, or the whole URL?

I was reading about the IDN homograph attack and didn't find it stated exactly whether browsers encode only the domain in punycode or whether the rest of the URL is included (path and query). So my question is: does one of the popular browsers (FF, IE, Chrome, Safari, Opera)…
Antonio Bakula
3
votes
1 answer

How can I rewrite a domain name to the original IDN, not the punycode?

I have bought an IDN domain name with non-Latin characters. It is good, but when I access the domain name, the address bar shows the punycode for the domain, not the actual domain, which will be hard for any user to remember. Is there any way I can…
alhoseany