I have a very simple bit of Scala code:
var str = "≤"
for( ch <- str ) { printf("%d, %x", ch.toInt, ch.toInt) ; println }
println
str = "\u2264" ;
for( ch <- str ) { printf("%d, %x", ch.toInt, ch.toInt) ; println }
In case that doesn't display properly in your browser: the first string contains a single character between the double quotes, the less-than-or-equal-to sign, U+2264.
The program outputs:
8218, 201a
226, e2
167, a7
8804, 2264
Clearly the first string is three characters long at run time, not the one character it is in the source file.
The source file is stored in UTF-8, and a hex dump confirms it is encoded correctly: the first string appears as the bytes 22 E2 89 A4 22. I'm using Eclipse with the Scala plugin for Eclipse.
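Interestingly, the three code points above are exactly what those UTF-8 bytes (E2 89 A4) decode to under the MacRoman charset. Here is a quick sketch that demonstrates this (the charset name "x-MacRoman" is an assumption on my part; it is available on Sun/Oracle JVMs but not required by the Java spec):

// Decode the UTF-8 byte sequence for U+2264 as if it were MacRoman.
val bytes = Array(0xE2, 0x89, 0xA4).map(_.toByte)
val decoded = new String(bytes, "x-MacRoman")  // may throw if the JVM lacks this charset
for( ch <- decoded ) { printf("%d, %x", ch.toInt, ch.toInt) ; println }

This prints the same three lines as the first loop above (8218/201a, 226/e2, 167/a7), so my suspicion is that the compiler is decoding my UTF-8 source file with some platform default charset rather than as UTF-8.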
- Does the scala compiler accept input files encoded in UTF-8?
- If so, why does my program produce these unexpected results?
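In case it helps, this snippet shows which charset the JVM treats as the default (a sketch; nothing Eclipse-specific is assumed):

// Show the JVM's default charset and the file.encoding system property.
println(java.nio.charset.Charset.defaultCharset())
println(System.getProperty("file.encoding"))

If that does not report UTF-8, it would at least be consistent with the behaviour I'm seeing.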