
I have a Ruby script that generates a UTF-8 CSV file on a remote Linux machine and then transfers the file to a Windows machine via SFTP.

I then need to open this file with Excel, but Excel doesn't handle UTF-8 correctly, so I always have to open the file first in a text editor that can convert UTF-8 to ANSI.

I would love to do this programmatically using Ruby and avoid the manual conversion step. What's the easiest way to do it?

PS: I tried using iconv but had no success.

chills42
Dema

4 Answers

ascii_str = yourUTF8text.unpack("U*").map{|c|c.chr}.join

This assumes that your text really does fit in the ASCII character set.
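As a minimal sketch of the same idea, the snippet below wraps the `unpack`/`chr` approach in a helper and substitutes a placeholder for any code point above 255, which cannot be mapped to a single byte. The name `to_single_byte` and the `"_"` placeholder are illustrative assumptions, not part of any library, and the placeholder itself should be plain ASCII.

```ruby
# Hypothetical helper (name and placeholder are assumptions for illustration).
# Each code point up to 255 becomes one byte; anything larger is replaced
# by `placeholder` instead of letting Integer#chr fail.
def to_single_byte(text, placeholder = "_")
  text.unpack("U*").map { |c| c <= 255 ? c.chr : placeholder }.join
end

converted = to_single_byte("ol\u00E1 \u4F60\u597D")
p converted.bytes # inspect the resulting single-byte values
```

Code points 128 to 255 come out as the matching Latin-1 bytes, so the result is only strictly ASCII when the input was.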

AShelly
  • That did it as well and it didn't need to use iconv at all. Thanks! – Dema Nov 04 '08 at 20:35
  • Note, if `c` is greater than 255, it will fail (since it is outside the single-byte range). – Sam Soffes Nov 05 '10 at 22:53
  • To fix the non-ASCII error that Sam ran into, you can use the following: yourUTF8text.unpack("U*").map{|c|c.chr rescue '_' }.join – metavida Apr 06 '11 at 17:20

I finally managed to do it using iconv; I was just messing up the parameters. This is how you do it:


require 'iconv'  # a separate gem since Ruby 2.0

# File.read opens, reads, and closes the file in one call
utf8_csv = File.read("utf8file.csv")

# Careful with the parameter order: TO first, then FROM!
ansi_csv = Iconv.iconv("LATIN1", "UTF-8", utf8_csv).join

File.open("ansifile.csv", "w") { |f| f.puts ansi_csv }

That's it!
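On Ruby 2.0 and later, where iconv is no longer part of the standard library, roughly the same file conversion can be sketched with `String#encode`. The helper name `convert_utf8_file`, the Windows-1252 target, and the `"?"` replacement character are assumptions for illustration; pass `"ISO-8859-1"` as the target if you need Latin-1 as in the iconv call above.

```ruby
# Hypothetical helper (name, default target, and replacement are assumptions).
# Reads a UTF-8 file and writes it back out in a single-byte encoding.
def convert_utf8_file(src_path, dst_path, target = "Windows-1252")
  utf8_text = File.read(src_path, encoding: "UTF-8")

  # Characters with no equivalent in the target encoding become "?"
  # instead of raising an exception.
  converted = utf8_text.encode(target, invalid: :replace, undef: :replace, replace: "?")

  # Binary mode writes the converted bytes untouched.
  File.open(dst_path, "wb") { |f| f.write(converted) }
end

# Example call, using the file names from the answer above:
# convert_utf8_file("utf8file.csv", "ansifile.csv")
```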

Dema

I had a similar issue trying to generate CSV files from user-generated content on the server. I found the unidecoder gem, which does a nice job of transliterating Unicode characters into ASCII.

Example:

"olá, mundo!".to_ascii                 #=> "ola, mundo!"
"你好".to_ascii                        #=> "Ni Hao "
"Jürgen Müller".to_ascii               #=> "Jurgen Muller"
"Jürgen Müller".to_ascii("ü" => "ue")  #=> "Juergen Mueller"

For our simple use case, this worked well.

Pivotal Labs has a great blog post on Unicode-to-ASCII transliteration that discusses this in more detail.

markquezada

Since Ruby 1.9 there is an easier way:

yourstring.encode('ASCII')

To avoid errors on characters that cannot be represented in ASCII, you can have them replaced:

yourstring.encode('ASCII', invalid: :replace, undef: :replace, replace: "_")
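A short sketch of the difference between the two calls, assuming a UTF-8 input string: the strict form raises `Encoding::UndefinedConversionError` on the first non-ASCII character, while the form with replacement options substitutes the placeholder. The helper name `to_ascii_lossy` is made up for this example.

```ruby
# Hypothetical helper name; the options themselves belong to String#encode.
def to_ascii_lossy(text, placeholder = "_")
  text.encode("ASCII", invalid: :replace, undef: :replace, replace: placeholder)
end

name = "J\u00FCrgen" # "Jürgen" written with a precomposed ü

begin
  name.encode("ASCII") # strict conversion: ü has no ASCII equivalent
rescue Encoding::UndefinedConversionError => e
  puts "strict conversion failed: #{e.class}"
end

puts to_ascii_lossy(name)
```

Note that this drops information rather than transliterating it, so "ü" becomes the placeholder, not "u".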
knut