
Recently I rebuilt some forms so that they accept internationalized domain names (IDNs) such as

http://例子.测试

I then store these as punycode. Before this change, we validated the domain with ColdFusion's isValid() function:

if(not isValid("url", sURL)){
return false;
}

With IDNs, isValid() fails when the domain looks like this:

http://例子.测

or when it is converted to its punycode form using CreateObject( "java", "java.net.IDN" ).toASCII(sURL); and in certain cases comes out like:

xn--http://133ab--xn328sdf

(This is a made-up example, but in certain cases there are characters before the http:// part.)
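To make the symptom concrete, here is a minimal Java sketch of what I believe is happening. It assumes that java.net.IDN treats the whole string as a dotted host name, so the scheme ends up inside the first label. The class name WholeUrlToAscii is hypothetical and only for illustration.

```java
import java.net.IDN;

// Hypothetical demo class: passes a full URL (scheme included) to IDN.toASCII.
final class WholeUrlToAscii {

    // IDN.toASCII splits its input on dots and encodes each label separately,
    // so "http://例子" is handled as if it were one domain label.
    static String convert(String wholeUrl) {
        return IDN.toASCII(wholeUrl);
    }

    public static void main(String[] args) {
        // The ACE prefix "xn--" lands in front of "http://".
        System.out.println(convert("http://例子.测试"));
    }
}
```

The output begins with xn--http://, which is why the result no longer looks like a URL to isValid().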

Is there currently a way, using a Java library, a ColdFusion library, or a regex, to validate both IDNs and "normal" domains?

Jarede

2 Answers

3

An IDN is a representation of the domain name only. You appear to be including the protocol (http://) in the string you are converting.

Remove the protocol first, either with a simple replace() or by using java.net.URL, then recombine the protocol and the IDN afterwards if required.

<cfscript>
oUrl = createobject("java", "java.net.URL").init("httP://例子.测试");
protocol = oUrl.getProtocol();
domain = oUrl.getHost();
idn = createobject("java", "java.net.IDN").toAscii(domain);
writeDump(protocol);  // -> http
writeDump(domain);    // -> 例子.测试
writeDump(idn);       // -> xn--fsqu00a.xn--0zwm56d
</cfscript>

Once you have the punycode you should then be able to use isValid() on it.
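The same split, convert, recombine steps can be sketched in plain Java, which may be easier to drop into a helper class. This is a minimal sketch, not a replacement for isValid(): the class IdnUrlNormalizer and its methods toAsciiUrl and isConvertible are hypothetical names, and "convertible" here only means that java.net.URL could parse the string and java.net.IDN could encode the host.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: converts only the host part of a URL to punycode.
final class IdnUrlNormalizer {

    // Split the URL, encode the host with IDN.toASCII, then recombine.
    static String toAsciiUrl(String rawUrl) throws MalformedURLException {
        URL url = new URL(rawUrl);                      // parses protocol, host, port, file
        String asciiHost = IDN.toASCII(url.getHost());  // e.g. 例子.测试 -> xn--fsqu00a.xn--0zwm56d
        StringBuilder out = new StringBuilder(url.getProtocol())
                .append("://")
                .append(asciiHost);
        if (url.getPort() != -1) {                      // -1 means no explicit port
            out.append(':').append(url.getPort());
        }
        return out.append(url.getFile()).toString();    // getFile() is path plus query
    }

    // Rough pre-check only: true if the URL parses and the host can be encoded.
    static boolean isConvertible(String rawUrl) {
        try {
            toAsciiUrl(rawUrl);
            return true;
        } catch (MalformedURLException | IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(toAsciiUrl("http://例子.测试"));
    }
}
```

The string returned by toAsciiUrl is plain ASCII with the protocol in its usual place, so it is the form you would pass to isValid("url", ...) and store.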

Chris Blackwell
  • For future users: you need to separate the domain and protocol, then recombine them when using isValid(). However, if you are storing it, use the toAscii function on the original URL, because when you later call toUnicode, the domain won't display correctly as Unicode unless the URL has the xn-- before the http://. – Jarede May 11 '12 at 11:12
  • If you are storing it you should store it in a valid form. You can always strip the protocol off again before converting back to unicode and then recombine with the protocol. – Chris Blackwell May 11 '12 at 15:24
0

The answers to "How to send email to recipient with umlauts in domain name?" might help.

They propose to use

<cfset jUrl = CreateObject( "java", "java.net.IDN" ).toUnicode(sUrl) />

or the LibIdn Java library
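For displaying a stored punycode URL again, a minimal Java sketch of the reverse direction could look like this. It assumes the stored value has the protocol in its normal place, so the host is taken out before calling IDN.toUnicode. The class IdnUrlDisplay and the method toUnicodeUrl are hypothetical names for illustration.

```java
import java.net.IDN;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: converts only the host part of a punycode URL back to Unicode.
final class IdnUrlDisplay {

    static String toUnicodeUrl(String asciiUrl) throws MalformedURLException {
        URL url = new URL(asciiUrl);
        String unicodeHost = IDN.toUnicode(url.getHost()); // decodes each xn-- label
        StringBuilder out = new StringBuilder(url.getProtocol())
                .append("://")
                .append(unicodeHost);
        if (url.getPort() != -1) {                         // keep an explicit port if present
            out.append(':').append(url.getPort());
        }
        return out.append(url.getFile()).toString();
    }

    public static void main(String[] args) throws MalformedURLException {
        System.out.println(toUnicodeUrl("http://xn--fsqu00a.xn--0zwm56d"));
    }
}
```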

Cyril Hanquez