I am using Python version 3.5.3 and Pandas version 0.20.1
I use read_csv to read in csv files. I use a file pointer according to this post (I prefer this over the solution using _enablelegacywindowsfsencoding()
). The following code works:
import pandas as pd
with open("C:/Desktop/folder/myfile.csv") as fp:
df=pd.read_csv(fp, sep=";", encoding ="latin")
This does work. However, when there is a special character like ä in the filename as follows:
import pandas as pd
with open("C:/Desktop/folderÄ/myfile.csv") as fp:
df=pd.read_csv(fp, sep=";", encoding ="latin")
Python displays an error message: (unicode error) 'utf-8' codec can't decode byte oxc4 in position 0: unexpected end of data.
I also tried to add a 'r' before the filepath, however I get the same error message, except that now I get a position as integer number which is exactly where my special character is in the filepath.
So the reason is the special character in the filepath name.
(Not a decode error which can be solved by using encoding="utf-8" or any other like ISO-5589-1. To be absolutely sure, I tried it with the following encodings and always got the same error message: utf-8, ISO-5589-1, cp1252)