5

A semi-well-known bug in Excel for Mac 2011 causes it to mangle imported .csv files containing accented Unicode characters. See: Microsoft Excel mangles Diacritics in .csv files?

For example, Pérez might get imported as Pérez.

My question: has this been fixed in Excel for Mac 2016?

Community
  • 1
  • 1

2 Answers2

8

No this has not been fixed in Excel for Mac 2016 and still miss-imports "Pérez". You can fix it in Excel by opening a New Worksheet, then "Import" and specifying UTF8 as the encoding. However there does not seem to be a way to construct a CSV that can be imported without knowing beforehand that it is going to Excel.

David
  • 1,018
  • 13
  • 22
Mitra Ardron
  • 343
  • 2
  • 8
  • 1
    Mitra, thanks, this is helpful. It appears that your answer was cut off. Do you want to complete it? – Prentiss Riddle Mar 07 '16 at 14:16
  • 1
    Thanks for pointing this out Prentiss. What I was saying was there does not seem to be a way to construct a CSV that can be imported without knowing beforehand that it is going to Excel. However since writing this comment on Saturday I have discovered there is a trick to writing a "BOM" string to the top of the file. This is well documented e.g. at http://chase-seibert.github.io/blog/2014/07/30/international-csv-files-python.html . – Mitra Ardron Mar 08 '16 at 20:02
  • 1
    Thanks, Mitra. That Chase Seibert article appears to cover the case of producing CSV files from scratch in Python. I don't suppose you know a simple solution for taking an existing CSV file and inserting the BOM? Although Seibert adds something which makes me wonder whether inserting the BOM does any good with Excel for Mac anyway: "After some more research, it turns out that Mac Excel simply does not support UTF-8 at all. So forget it ever working there." – Prentiss Riddle Mar 09 '16 at 22:12
  • 1
    The comment to forget about UTF-8 in Mac Excel depends on whether you are concerned with working with UTF-8 text in Excel or whether you are answering the OP's question in a strict sense. The fact that it is possible to import UTF-8 text by BOM insertion (see my answer) means you can get your work done if you have the software to do the insertion. I have spent the afternoon doing this. Certainly it is reprehensible that Microsoft makes one have to do this, and the strict answer to the question is NO. (But what do you expect from Microsoft?) – David Mar 15 '17 at 23:55
  • 1
    Another point, I do not find an ‘Import’ option on the Mac Excel, unlike the Windows. Are you sure? That is why I specify ‘Open’ in my workaround. – David Mar 16 '17 at 18:20
  • David, I do see an "Import" option in Excel for Mac 15.17 under File > Import. Are you looking in the wrong place? – Prentiss Riddle Mar 16 '17 at 19:50
  • And returning to this after a year, I see that I misunderstood Mitra's original answer. I *can* import Unicode files through the regular File Import Wizard if I explicitly set "File origin" to "Unicode (UTF-8)" in the first screen of the wizard. In my recollection, that wasn't possible in Excel for Mac 2011. – Prentiss Riddle Mar 16 '17 at 19:55
0

This is a workaround for my own particular UTF-8 problem, refers to tab-separated values rather than CSV, and also involves proprietary software*. Hence it may not be of general applicability. However I am posting it in case anyone else finds it useful or is able to adapt it to their problem.

I work with a UTF-8 MySQL database which contains one field with UTF-8 text because of Greek symbols. After making an SQL query I often paste from my Mac terminal into the BBEdit text editor, do some minor clean up, and then import into Excel for distribution and presentation. Having read the answer from @MitraArdon, the following allows me to preserve the UTF-8:

  1. Set the BBEdit text document as Unicode (UTF-8, with BOM)† using the selection options at the bottom of the page.

  2. Launch Excel 2016 and import the text file (as delimited text — I use the pipes from the MySQL tables as delimiters). Save as .xlsx.

There is no step 3 (as the ad used to say).

For the reverse process I cannot do the normal “Save as tab delimited text” as the Greek letters will be replaced by underscores. As the document is in UTF-16 one must:

  1. Select “Save as UTF-16 Unicode Text (.txt)”

  2. Open in BBEdit and change the format to UTF-8 (generally ‘No BOM’).

Then I can upload back into my UTF-8 MySQL database, if required.

Footnotes

*Actually BareBones’ cutdown free text editor, TextWrangler, should also work as it has the option to save as UTF-8 BOM.

†Or convert to UTF-16 BOM.

David
  • 1,018
  • 13
  • 22