Unsure if CSV file is encoded as utf-8 using csv.writer

Question

This is similar to my other post in that I can't tell if the character encoding is being preserved once the *.csv is written to.

If I open and write to the csv, then open and read it--both times including encoding='utf-8' in the parameters--it appears to be utf-8.

import csv
import os

path_to_file = os.path.join(r'C:\Users\jpm\Downloads', 'c19_Vaccine_Current.csv')

# write to file, explicitly encoding the text as UTF-8
with open(path_to_file, 'w', newline='', encoding='UTF-8') as csvfile:
    f = csv.writer(csvfile)
    # write the headers of the csv file
    f.writerow(['County', 'AdminCount', 'AdminCountChange', 'RollAvg', 'AllocDoses', 'FullyVaccinated',
                'FullyVaccinatedChange', 'ReportDate', 'Pop', 'PctVaccinated', 'LHDInventory', 'CommInventory',
                'TotalInventory', 'InventoryDate'])

with open(path_to_file, 'r', encoding='UTF-8') as r:
    print(r)

#prints this
>>> <_io.TextIOWrapper name='C:\\Users\\jpm\\Downloads\\c19_Vaccine_Current.csv' mode='r' encoding='UTF-8'>

But, if I simply open it (as this post does) the encoding is cp1252.

with open(r'C:\Users\jpm\Downloads\c19_Vaccine_Current.csv') as f:
    print(f)

#prints this
>>> <_io.TextIOWrapper name='C:\\Users\\jpm\\Downloads\\c19_Vaccine_Current.csv' mode='r' encoding='cp1252'>

I can open a *.csv in Notepad --> set the encoding to utf-8 --> and save as a *.csv. That works. But, I'm asking how do I ensure the *.csv is set to utf-8 using my script?

Seems that the default encoding for the `open` call is "cp1252" on your machine. It's your responsibility as a programmer to provide the correct encoding in your call to `open`. Python won't guess. — Matthias, Mar 17 '21 at 15:59
Not sure I understand. In the first script you explicitly set "utf-8" (and this is really the thing to do: one should specify the encoding). In the second example: do not trust the result. It depends on the Python version, the environment, and the operating system. You may get cp1252 in some Windows environments, but it is not guaranteed — Giacomo Catenazzi, Mar 17 '21 at 16:01
The encoding that is defaulted to in `open` doesn't mean that the file is actually encoded like that. You already explicitly use utf-8 when you created your file originally — juanpa.arrivillaga, Mar 17 '21 at 16:10
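
To illustrate the point made in the comments above, here is a minimal sketch of one way to check what actually ended up on disk: write the file with an explicit encoding='utf-8', then read it back in binary mode and try to decode the raw bytes. The helper names (write_csv_utf8, is_valid_utf8), the temporary demo path, and the sample row are assumptions made for illustration and are not part of the original script.

```python
import csv
import locale
import os
import tempfile


def write_csv_utf8(path, rows):
    """Write rows to path, explicitly encoding the text as UTF-8."""
    with open(path, 'w', newline='', encoding='utf-8') as csvfile:
        csv.writer(csvfile).writerows(rows)


def is_valid_utf8(path):
    """Return True if the raw bytes of the file decode cleanly as UTF-8."""
    with open(path, 'rb') as f:  # binary mode: no decoding is applied
        raw = f.read()
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical demo file in a temporary directory (not the path from the question).
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')

# Sample row with a non-ASCII character (n with tilde) and a placeholder value.
write_csv_utf8(demo_path, [['County', 'Pop'], ['Do\u00f1a Ana', '0']])

with open(demo_path, 'rb') as f:
    raw_bytes = f.read()

# What open() falls back to when no encoding is given; this is a property
# of the environment, not of the file.
print(locale.getpreferredencoding(False))

print(is_valid_utf8(demo_path))
print(raw_bytes)
```

The object printed by print(f) in the question only reports which codec open() was asked (or defaulted) to use for decoding; it says nothing about the bytes in the file. Note also that a file containing only ASCII characters, such as the header row in the question, has the same bytes in UTF-8 and in cp1252, so the two encodings only become distinguishable once non-ASCII text is written.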
