5

Unfortunately, version 0.1 of the unicode gem (sudo gem install unicode) doesn't work on Ruby 1.9. I have the following snippet:

require "rubygems"
require "unicode"

str = "áéíóúç"
Unicode.normalize_KD(str).gsub(/[^\x00-\x7F]/n, "")
#=> aeiouc

I use it to convert titles to permalinks, replacing accented characters with their ASCII equivalents instead of dropping them.

Is there a way of converting such texts using pack or unpack methods?

Nando Vieira
  • You should dig through the ruby-talk archives. I am pretty sure that within the last few days/weeks, the author of the Unicode gem announced a new version there, and I'm also pretty sure that Ruby 1.9 was discussed in that thread. I didn't actually read the thread, though, so I don't have any specifics. – Jörg W Mittag Aug 24 '09 at 18:18
  • Actually, that was *not* the Unicode gem, but the Unicode-Utils gem mentioned by @molf below. (It also wasn't in the past few days, it was 3 months ago.) – Jörg W Mittag Aug 25 '09 at 11:17

2 Answers

13

Update: a better option may be the unicode_utils gem, which was created specifically to provide these missing features:

require "unicode_utils"
UnicodeUtils.nfkd("áéíóúç").gsub(/[^\x00-\x7F]/,'').to_s
#=> "aeiouc"

Can you depend on Rails' ActiveSupport? If so, you can do the following:

require "active_support" # the older "activesupport" file name is not shipped by newer releases
mb_str = ActiveSupport::Multibyte::Chars.new("áéíóúç")
mb_str.normalize(:kd).gsub(/[^\x00-\x7F]/,'').to_s
#=> "aeiouc"

ActiveSupport::Multibyte was written to bring UTF-8/Unicode support to Ruby 1.8, but works fine in 1.9 too. You may be able to borrow some of the code if you don't want it as an external dependency.
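If a newer Ruby is an option, the same normalize-then-strip approach can be sketched with no gems at all. This assumes Ruby 2.2 or later, where String#unicode_normalize is built in, so it does not help on 1.9 itself. The helper name ascii_fold is hypothetical and only used for illustration:

```ruby
# Sketch only: assumes Ruby 2.2+, where String#unicode_normalize is available.
# `ascii_fold` is a hypothetical helper name, not part of any library.
def ascii_fold(str)
  # NFKD decomposes "á" into "a" followed by a combining accent (U+0301).
  # The gsub then removes every non-ASCII character, including those accents.
  str.unicode_normalize(:nfkd).gsub(/[^\x00-\x7F]/, "")
end

puts ascii_fold("áéíóúç") # prints "aeiouc"
```

Characters that have no ASCII decomposition (for example "ø" or "ß") are simply dropped by this approach, just as in the snippets above.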

Cœur
molf
  • Nice replacement! I use the Unicode gem in a Rails plugin, so I can certainly rely on it! Thanks a lot! It would be nice to have a replacement that needs only Ruby 1.9, though, because a lot of gems use the Unicode gem and ActiveSupport may not be an option for all of them. – Nando Vieira Aug 24 '09 at 16:33
  • That's the one I was thinking about in my comment above. Thanks for reminding me about that! I'm actually going to need that right now in my toy project. – Jörg W Mittag Aug 25 '09 at 11:18
  • I think it is better to use the unicode-utils gem instead of ActiveSupport when replacing the unicode gem in other gems! Thanks! ;) – Nando Vieira Aug 26 '09 at 12:09
1

There is also the I18n.transliterate('string') method, which is available in Rails through the i18n gem. Works like a charm.

Marcin Adamczyk