
Another question asked how to replace Umlaute. The accepted answer gave the following code:

# encoding: utf-8
foo = "ich bin doch nicht blöd, mann!".gsub(/[äöü]/) do |match|
    case match
        when "ä" 'ae'
        when "ö" 'oe'
        when "ü" 'ue'
    end
end
puts foo

However, when I try to run this, the output is:

$ ruby /tmp/test.rb 
ich bin doch nicht bld, mann!

So the Umlaute obviously don't get replaced. Is there something I am missing? I'm using Ruby 1.9.3p362 (2012-12-25 revision 38607) [x86_64-linux]

Markus

3 Answers


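Your `when` clauses use incorrect syntax: each one needs either `then` or a newline between the condition and the result. As written, "ä" 'ae' is parsed as the single string "äae", so no clause ever matches, the case expression returns nil, and gsub substitutes an empty string.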

# encoding: utf-8
foo = "ich bin doch nicht blöd, mann!".gsub(/[äöü]/) do |match|
    case match
        when "ä" then 'ae'
        when "ö" then 'oe'
        when "ü" then 'ue'
    end
end

puts foo

or

# encoding: utf-8
foo = "ich bin doch nicht blöd, mann!".gsub(/[äöü]/) do |match|
    case match
        when "ä"
            "ae"
        when "ö"
            "oe"
        when "ü"
            "ue"
    end
end

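A more robust approach would be result = Iconv.iconv('ascii//ignore//translit', 'utf-8', foo), but it requires setting the locale to "de_DE", which cannot be done in Ruby without a C extension. Note that Iconv was deprecated in Ruby 1.9.3 and removed from the standard library in Ruby 2.0.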

Esailija
  • It's interesting that Ruby doesn't throw a warning for missing `then` statements! +1 – cggaurav Jan 11 '13 at 22:27
    @cggaurav Usually leaving out the `then` would be a syntax error, but writing two string literals next to each other is actually syntactically valid. Writing `"ä" 'ae'` is the same as writing `"äae"`. That's a syntactic oddity that Ruby inherited from C. – sepp2k Jan 11 '13 at 22:36
  • I read discussions on Ruby core developers site about deprecating this feature when the two literals are on the same line. It seems to be the consensus that it is not useful. – sawa Jan 11 '13 at 22:49
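The literal concatenation described in the comments above can be checked directly. The following is a minimal sketch; the helper name broken_replacement is made up for illustration and is not part of the original code.

```ruby
# encoding: utf-8

# Adjacent string literals are joined by the parser into one string.
joined = "ä" 'ae'

# Hypothetical helper reproducing the question's case expression.
# Each `when` tests against a concatenated string ("äae", "öoe", "üue")
# and has an empty body, so a lone umlaut never matches and the
# case expression evaluates to nil.
def broken_replacement(match)
  case match
    when "ä" 'ae'
    when "ö" 'oe'
    when "ü" 'ue'
  end
end

p joined                    # => "äae"
p broken_replacement("ö")   # => nil

# gsub converts the block's nil result to "", which deletes the umlaut.
p "blöd".gsub(/[äöü]/) { |m| broken_replacement(m) }   # => "bld"
```

This matches the output in the question: the umlaut is matched by the regex, but it is replaced with an empty string.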
"ich bin doch nicht blöd, mann!".gsub("ä","ae").gsub("ö","oe").gsub("ü","ue")

This should do the trick.

cggaurav
  • That was fast and works. Thanks! Any idea why the other code won't work? (I will accept your answer in 10 minutes, since this is the minimum time to wait before accepting) – Markus Jan 11 '13 at 22:22
    This gets ugly real quick (think Ä, ä, Ö, ö, Ü, ü and ß; é è etc.) – steenslag Jan 11 '13 at 22:59

(Not a real answer to the question, but a bit large for a comment.) gsub has syntax for this kind of substitution, using a Hash:

#encoding: utf-8
table = {"ä" => 'ae',
         "ö" => 'oe',
         "ü" => 'ue'}
re = Regexp.union(table.keys)
# re = /[äöü]/ # is fine too
p "ich bin doch nicht blöd, mann!".gsub(re, table)
# => "ich bin doch nicht bloed, mann!"
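The same Hash-based approach might be extended to the uppercase umlauts and ß mentioned in the comments on the previous answer. This is only a sketch: the names UMLAUT_TABLE, UMLAUT_RE and transliterate_umlauts are invented here, and mapping "Ä" to "Ae" (rather than "AE") is an assumption that will look wrong inside all-caps words.

```ruby
# encoding: utf-8

# Hypothetical lookup table; the capitalisation of the uppercase
# replacements is an assumption, not something the thread specifies.
UMLAUT_TABLE = {
  "ä" => "ae", "ö" => "oe", "ü" => "ue",
  "Ä" => "Ae", "Ö" => "Oe", "Ü" => "Ue",
  "ß" => "ss"
}.freeze

# Build one regex that matches any key of the table.
UMLAUT_RE = Regexp.union(UMLAUT_TABLE.keys)

# gsub looks up each match in the Hash and substitutes its value.
def transliterate_umlauts(text)
  text.gsub(UMLAUT_RE, UMLAUT_TABLE)
end

p transliterate_umlauts("Übergrößenträger")
# => "Uebergroessentraeger"
```

Keeping the table in one place means further characters (é, è and so on) only need a new Hash entry, not another chained gsub call.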
steenslag