I have an xlsx file, and I do the following to export it as a .csv and read it:

  • Export the xlsx file as csv using Excel, with the default encoding
  • Open the .csv file with Notepad and save it again, specifying UTF-8 as the encoding (Notepad writes the BOM)
  • Read the file with CSV.read(path_to_file)

It seems to work well, but for some reason the first header is corrupted by an unknown character. I have no idea what it is: when I try to copy and paste it, it disappears, and Windows displays it as a big white rectangle.

When I open my file with any text editor, there doesn't seem to be a problem.

The first line looks like: Id;Type....

In case this helps:

csv.headers.first # => ".Id" where . is that character
csv.headers.first.first.bytes # => [239, 187, 191]
csv.headers.first.first.b # => "\xEF\xBB\xBF"

How do I fix that?

Windows 10, Ruby 2.2

Cyril Duchon-Doris

1 Answer

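That is the UTF-8 BOM (byte order mark): the three bytes EF BB BF that Notepad writes at the start of the file. Try setting the mode like this, so that Ruby strips the BOM while reading: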

CSV.read(path_to_file, 'r:bom|utf-8')
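To see the effect of the mode string in isolation, here is a minimal sketch. It assumes a semicolon-separated file like the one in the question. The sample data, the temporary file and the read_headers helper are made up for illustration, and the file is opened with File.open before being handed to CSV.new so that the mode string is visible:

```ruby
require 'csv'
require 'tempfile'

# The three bytes of the UTF-8 byte order mark, as a binary string.
BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: open the file with the given mode and return its CSV headers.
def read_headers(path, mode)
  File.open(path, mode) do |file|
    CSV.new(file, col_sep: ';', headers: true).read.headers
  end
end

# Build a small sample file that starts with a BOM, as Notepad would save it.
sample = Tempfile.new(['bom_sample', '.csv'])
sample.binmode
sample.write(BOM + "Id;Type\n1;a\n".b)
sample.close

plain_headers = read_headers(sample.path, 'r:utf-8')     # BOM is not stripped by this mode
bom_headers   = read_headers(sample.path, 'r:bom|utf-8') # BOM is stripped on open

p plain_headers.first.bytes # inspect the raw bytes of the first header
p bom_headers.first         # first header read with the bom|utf-8 mode
```

The bom| prefix only skips a BOM if one is present, so the same mode can also be used for files saved without it.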