1

We have users who need to be able to export data to a csv, which they open in Excel on mac machines, that support utf-8 characters.

NOTE: We don't want our users to have to go to the data tab, click import from text, then... We want them to be able to open the file immediately after downloading and have it display the correct info.

At first, I thought this was just an encoding/decoding problem since we are using python 2.7 (actively working on upgrading to python 3.6), but after that was fixed, I discovered Excel was the cause of the problem (as the csv works fine when opened in a text editor or even Numbers). The solution i am trying involves adding the utf-8 BOM to the beginning of the file as I read somewhere that this would let Excel know that it requires utf-8.

#Here response is just a variable that is valid when used like this and 
#we can export CSV's fine that don't need utf-8    
writer = csv.writer(response)
writer.writerow("0xEF0xBB0xBF")

I was hoping that but just adding the utf-8 BOM to the beginning of the csv file like this would allow Excel to realize it needed to use utf-8 encoding when opening this file, but alas it does not work. I am not sure if this is because Excel for MAC doesn't support this or if I simply added the BOM incorrectly.

Edit: I'm not sure why I didn't mention it, as it was critical in the solution, but we are using Django. I found this stack overflow post that gave the solution (which I've included below).

Poe Warner
  • 11
  • 4
  • I did find [this](https://donatstudios.com/CSV-An-Encoding-Nightmare) website that says using utf-18le encoding is the proper way, but I couldn't get that to work either. I tried using the information in [this](https://stackoverflow.com/questions/5202648/adding-bom-unicode-signature-while-saving-file-in-python) stackoverflow post, but this doesn't work because `'_csv.writer' object has no attribute 'write'`, which makes sense, but I don't know what to use instead. – Poe Warner Jun 18 '19 at 21:20

1 Answers1

0

Because we are using Django, we were able to just include:

response.write('\xEF\xBB\xBF')

before creating a csv writer and adding the content to the csv.

Another idea that probably would have lead to a solution is opening the file normally, adding the BOM, and then creating a csv writer (Note: I did not test this idea, but if the above solution doesn't work for someone/they aren't using Django, it is an idea to try).

Poe Warner
  • 11
  • 4
  • For a better solution, consider using the 3rd party `unicodecsv` module which corrects the problems with Python 2.7's csv module (Python 3 can handle Unicode strings), or look at the UnicodeReader/UnicodeWriter examples in the [csv documentation](https://docs.python.org/2.7/library/csv.html#examples). – Mark Tolonen Jul 18 '19 at 02:13
  • Since are porting to Python 3 soon we just needed a quick fix until that time. But for others who are not switching to Python 3 Mark's comment is probably a better solution if they are looking for something more permanent/standard. – Poe Warner Jul 19 '19 at 19:32