I have a problem using UTF-8 strings such as "\u0161\u010D\u0159\u017E\u00FD". When such a string is defined as a variable in my program, it works fine. But when I read the same string from an external file, I get the wrong output: the escape sequences are printed as they are instead of the characters they stand for. I am probably missing some necessary encoding step.
My code:
file = "c:\\...\\vlmList_unicode.txt" # the file contains the text: \u306b\u3064\u3044\u3066
data = File.open(file, 'rb') { |io| io.read.split(/\t/) }
puts data
data_var = "\u306b\u3064\u3044\u3066"
puts data_var
Output:
\u306b\u3064\u3044\u3066 # what I don't want
について # what I want
I'm reading the file in binary mode by specifying 'rb', but obviously there is some other problem. I run my code in NetBeans 7.3.1 with the built-in JRuby 1.7.3 (I also tried Ruby 2.0.0, with the same result).
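To illustrate what I suspect is going on, here is a minimal sketch. It assumes the file holds the literal characters `\`, `u`, `3`, `0`, `6`, `b` and so on, rather than UTF-8 encoded text, so nothing ever interprets the escapes the way the Ruby parser does for a double-quoted literal. The helper name `unescape_unicode` is hypothetical, not a built-in, and it only handles four-digit `\uXXXX` escapes.

```ruby
# Single quotes: no escape processing, so this mimics the text as read from the file.
# (Assumption: the file really contains these literal characters.)
raw = '\u306b\u3064\u3044\u3066'

# Hypothetical helper: replace each \uXXXX escape with the character it names.
def unescape_unicode(str)
  str.gsub(/\\u([0-9a-fA-F]{4})/) { [$1.hex].pack('U') }
end

decoded = unescape_unicode(raw)
puts raw      # the escapes are printed as they are
puts decoded  # the characters themselves are printed
```

If this assumption is right, the 'rb' mode is not the culprit: the bytes are read correctly, they just are not escape-decoded.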
Since I'm new to the Ruby world, any ideas are welcome.