3

How can I make Ruby's to_yaml method store UTF-8 strings with their original characters instead of escape sequences?

Bogdan Gusiev

4 Answers

7
require 'yaml'
YAML::ENGINE.yamler='psych'
'Résumé'.to_yaml # => "--- Résumé\n...\n"

Ruby ships with two YAML engines: Syck and Psych. Syck is old and unmaintained, but it is the default in 1.9.2, so you need to switch to Psych, which dumps UTF-8 strings as UTF-8.
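On newer Rubies the `YAML::ENGINE` switch may no longer exist (an assumption worth checking against your version), so a guarded variant of the snippet above might look like the following minimal sketch. It only selects Psych when the constant is present, then checks that the dumped text keeps its original characters.

```ruby
require 'yaml'

# Select Psych explicitly only where the engine switch still exists;
# if the constant is absent, this line is skipped.
YAML::ENGINE.yamler = 'psych' if defined?(YAML::ENGINE)

text = 'Résumé'
yaml = text.to_yaml

# The dumped document should contain the original characters rather than
# \x escapes, and should load back to the same string.
puts yaml
puts yaml.include?(text)
puts YAML.load(yaml) == text
```

The exact trailer of the dumped document (with or without a `...` line) can differ between library versions, so the check compares content instead of the full string.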

Evgenii
  • 36,389
  • 27
  • 134
  • 170
  • 1
    Note that this answer only works for Ruby 1.9.3 (where Psych is already the default). The above code does not work for Ruby 1.9.2 (`no such file to load -- psych`). – Phrogz Mar 18 '12 at 13:50
  • 1
    …unless you first install the `psych` gem. – Phrogz Mar 18 '12 at 16:00
3

This is probably a really bad idea as I'm sure YAML has its reasons for encoding the characters as it does, but it doesn't seem too hard to undo:

require 'yaml'
require 'yaml/encoding'

text = "Ça va bien?"

puts text.to_yaml(:Encoding => :Utf8) # => --- "\xC3\x87a va bien?"
puts YAML.unescape(YAML.dump(text)) # => --- "Ça va bien?"
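If `yaml/encoding` is not available, the same idea can be roughly approximated by hand. The sketch below is not the library's implementation: `unescape_hex` is a hypothetical helper that only handles `\xNN` sequences and ignores every other YAML escape.

```ruby
# Hypothetical helper, not part of any YAML library: turns literal \xNN
# escape sequences back into the bytes they stand for.
def unescape_hex(escaped)
  # Work on a binary copy so single bytes can be substituted safely,
  # then reinterpret the resulting bytes as UTF-8.
  bytes = escaped.b.gsub(/\\x(\h\h)/) { $1.hex.chr }
  bytes.force_encoding(Encoding::UTF_8)
end

escaped = '--- "\xC3\x87a va bien?"' # single quotes keep the backslashes literal
puts unescape_hex(escaped) # prints: --- "Ça va bien?"
```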
tadman
  • It was reasonable in the past to use ASCII encoding by default, but this is not the case now. And the manual says: "YAML streams are encoded using the set of printable Unicode characters, either in UTF-8 or UTF-16." So I think it's just a limitation in Ruby's library; to_yaml should return UTF-8 by default. Otherwise it's really burdensome to modify those YAML files with an editor. – tokland Mar 11 '11 at 13:31
  • 1
    dump sometimes returns you a binary type: YAML.unescape(YAML.dump("sú")) -> --- !binary | c8O6 – tokland Mar 16 '11 at 11:01
3

Check out Ya2Yaml at RubyForge.

2

For Ruby 1.9.3+, this is not a problem: the default YAML engine is Psych, which supports UTF-8 by default.

For Ruby 1.9.2 and earlier, you need to install the psych gem and require it before you require yaml:

irb(main):001:0> require 'yaml'
#=> true
irb(main):002:0> require 'psych'
#=> true
irb(main):003:0> YAML::ENGINE
#=> #<YAML::EngineManager:0x00000001a1f642 @yamler="syck">
irb(main):004:0> "ça va?".to_yaml
#=> "--- \"\\xC3\\xA7a va?\"\n"

irb(main):001:0> require 'psych' # gem install psych
#=> true
irb(main):002:0> require 'yaml'
#=> true
irb(main):003:0> YAML::ENGINE
#=> #<YAML::EngineManager:0x00000001a1f828 @yamler="psych">
irb(main):004:0> "ça va bien!".to_yaml
#=> "--- ça va bien!\n...\n"

Alternatively, set the yamler as Evgeny suggests (assuming you have installed the psych gem):

irb(main):001:0> require 'yaml'
#=> true
irb(main):002:0> YAML::ENGINE.yamler
#=> "syck"
irb(main):003:0> "ça va?".to_yaml
#=> "--- \"\\xC3\\xA7a va?\"\n"
irb(main):004:0> YAML::ENGINE.yamler = 'psych'
#=> "psych"
irb(main):005:0> "ça va".to_yaml
#=> "--- ça va\n...\n"
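In a script rather than irb, the require order shown above could be wrapped so that a missing gem does not abort the program. This is a minimal sketch under that assumption; `utf8_preserved?` is a hypothetical helper added only to check the result.

```ruby
# Load Psych first so that a later `require 'yaml'` picks it up (the order
# recommended above for 1.9.2). If it cannot be loaded, warn and carry on.
begin
  require 'psych'
rescue LoadError
  warn 'psych is not available; YAML output may escape non-ASCII characters'
end
require 'yaml'

# Hypothetical helper: true when the dumped YAML still contains the
# original text verbatim instead of escape sequences.
def utf8_preserved?(text)
  text.to_yaml.include?(text)
end

puts utf8_preserved?('ça va bien!')
```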
Phrogz