My computer has no idea what this character is. It came from Excel.
In Excel it was a weird space; now it is literally represented by several symbols, i.e. my computer has no idea what it is.
This character is displayed as a Ê in Excel when I open the CSV (in the .xls version it is a space of some kind). OS X's TextEdit treats it as a big space, this long: " ", which is, I think, what it is. Ruby's CSV parser blows up when it tries to parse the file as normal UTF-8, and I have to add :encoding => "windows-1251:utf-8"
to parse it, in which case Ruby turns it into a "K". This K appears in groups of 9, 12, 15 and 18 (KKKKKKKKK, etc.) in my CSV, and cannot be removed via gsub(/K/)
(groups of K, /KKKKKKKKK/, etc., cannot be removed either)! I've also used the open-source tool CSVfix, but its "remove leading and trailing spaces" command had no effect on the Ks.
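One way to pin down what the character is would be to look at the raw bytes rather than the decoded text. The sketch below is only an illustration: the helper names are hypothetical, and the sample byte 0xCA is an assumption (it is the byte that single-byte Western encodings show as Ê), not something confirmed from the actual file. If that assumption holds, decoding as windows-1251 yields the Cyrillic letter Ka (U+041A), which looks like a Latin K but would not match gsub(/K/).

```ruby
# Hypothetical helper: count each non-ASCII byte (> 127) in a raw binary string.
def non_ascii_byte_counts(raw)
  raw.each_byte.each_with_object(Hash.new(0)) do |byte, counts|
    counts[byte] += 1 if byte > 127
  end
end

# Hypothetical helper: decode a single byte under a given single-byte encoding.
def decode_byte(byte, encoding)
  byte.chr.force_encoding(encoding).encode("UTF-8")
end

# Stand-in data; for the real file use File.binread("input.csv").
# The 0xCA byte is an assumption, not something confirmed from the file.
raw = "a,b\xCA\xCA\xCA,c".b

# Report each suspicious byte and the code point it becomes under windows-1251.
non_ascii_byte_counts(raw).each do |byte, count|
  decoded = decode_byte(byte, "Windows-1251")
  puts format("byte 0x%02X x%d decodes to U+%04X", byte, count, decoded.ord)
end

# Decode the whole string the same way the CSV :encoding option would.
text = raw.dup.force_encoding("Windows-1251").encode("UTF-8")

without_latin_k = text.gsub("K", "")          # Latin K (U+004B)
without_cyrillic_ka = text.gsub("\u041A", "") # Cyrillic Ka (U+041A)
```

Comparing `without_latin_k` and `without_cyrillic_ka` against `text` shows which of the two look-alike letters is really in the string. If the byte in the real file is something other than 0xCA, the same helpers still report which byte it is.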
I've tried using sed
as suggested in Remove non-ascii characters from csv, but got errors like
sed: 1: "output.csv": invalid command code o
when running something like sed -i 's/[\d128-\d255]//' input.csv
on a Mac.
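For reference, here is a minimal Ruby sketch of what that sed command is meant to do, namely strip every byte above 127. It works on raw bytes, so it does not depend on guessing the file's encoding. The helper names and file paths are hypothetical, and it would also delete legitimate accented characters, so it is a blunt tool.

```ruby
# Hypothetical helper: drop every byte above 127 from a raw binary string.
def strip_non_ascii(raw)
  raw.each_byte.select { |byte| byte < 128 }.pack("C*").force_encoding("UTF-8")
end

# Hypothetical helper: write a cleaned copy and leave the input file untouched.
def strip_file(input_path, output_path)
  File.binwrite(output_path, strip_non_ascii(File.binread(input_path)))
end
```

Something like `strip_file("input.csv", "output.csv")` would then produce a pure-ASCII copy that the CSV parser can read without an :encoding option.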