I am trying to write data stored in a DataFrame to a CSV file and am running into the following error:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 202-203: character maps to <undefined>

I looked at this thread and this one and tried their suggestion of adding encoding='utf-8', but it did not help. Is there another encoding I should use, or an alternative solution to this problem? The function that writes to the CSV files is as follows:

def data_to_csv(data, data2):
    encode = 'utf-8'
    with open('data.csv', 'w') as data_csv:
        data.to_csv(path_or_buf=data_csv, encoding=encode)
    with open('data2_data.csv', 'w') as data2_csv:
        data2.to_csv(path_or_buf=data2_csv, encoding=encode)

Any help would be greatly appreciated!

Buzzkillionair
  • So, what does your data look like? Esp. at position 202-203? – MSpiller Jan 04 '23 at 15:25
  • I will see if I can figure that out, @M.Spiller. Either way, I would assume utf-8 should work, because the characters are all keys you would find on an English keyboard – Buzzkillionair Jan 04 '23 at 15:40
  • @M.Spiller it appears to be an emoji. I can see how this may cause issues. Any idea regarding how to deal with that issue? Also, is it even switching encoding to utf-8 based on the error? – Buzzkillionair Jan 04 '23 at 15:44
  • I would try to write to a UTF-8 file. I.e. pass `encoding='utf-8'` to the call to `open` (not (only) to `to_csv`) – MSpiller Jan 04 '23 at 15:46
  • This fixed it! Thank you @M.Spiller !! If you want to put this as the solution to the problem I would be happy to mark it as the correct solution! – Buzzkillionair Jan 04 '23 at 15:52
  • Sure. Added as answer. – MSpiller Jan 04 '23 at 16:32

1 Answer

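According to the error message, data contains characters that the 'charmap' codec cannot encode. That codec is not UTF-8: it means the file is being written with a legacy single-byte code page (on Windows the default is typically cp1252), which has no mapping for characters such as emoji.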
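The failure can be reproduced without pandas. The following is a minimal sketch, assuming the platform default is cp1252 (a common Windows default; the actual code page on the asker's machine is not stated). The helper `try_write` and the sample text are made up for illustration.

```python
import os
import tempfile

# Sample text containing an emoji (hypothetical data, not the asker's).
text = "mood: \U0001F600"


def try_write(encoding):
    """Write `text` to a temp file opened with `encoding`.

    Returns None on success, or the UnicodeEncodeError message on failure.
    """
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as fh:
            fh.write(text)
        return None
    except UnicodeEncodeError as exc:
        return str(exc)
    finally:
        os.remove(path)


# cp1252 stands in for the assumed platform default encoding.
cp1252_error = try_write("cp1252")
utf8_error = try_write("utf-8")

print("cp1252:", cp1252_error)
print("utf-8: ", utf8_error)
```

The cp1252 attempt reports a 'charmap' error like the one in the question, while the UTF-8 attempt succeeds.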

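You pass encoding='utf-8' to to_csv, but because you hand it an already opened text file, the file's own encoding is what gets used. open() without an encoding argument falls back to the platform default, so you should open/create your .csv files with UTF-8 encoding: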

with open('data.csv', 'w', encoding='UTF-8') as data_csv:

should work.
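Applied to the function from the question, the fix might look like the sketch below. It shows two variants: keeping the explicit file handle but giving it the encoding, or passing the path straight to to_csv so that pandas opens the file itself and applies its encoding argument. The `newline=""` argument follows the pandas recommendation for file objects passed to to_csv; the type hints are added for illustration only.

```python
import pandas as pd


def data_to_csv(data: pd.DataFrame, data2: pd.DataFrame) -> None:
    # Variant 1: keep the file handle, but open it as UTF-8.
    # The handle's encoding decides how the text is written.
    with open("data.csv", "w", encoding="utf-8", newline="") as data_csv:
        data.to_csv(data_csv)

    # Variant 2: pass the path and let pandas open the file,
    # so the encoding argument of to_csv takes effect.
    data2.to_csv("data2_data.csv", encoding="utf-8")
```

Either variant should write the emoji without raising UnicodeEncodeError. If the file is meant to be opened in Excel, encoding='utf-8-sig' may be worth trying instead, since it adds a byte order mark.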

MSpiller