I am having a german csv file, which I want to read with pd.read_csv
.
Data:
The original file looks like this:
So it has two Columns (A,B) and the seperator should be ';'
,
Problem: When I ran the command:
dataset = pd.read_csv('C:/Users/.../GermanNews/articles.csv',
encoding='utf-8', header=None, sep=';')
I get the error:
ParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 3
Half-Solution: I understand this could have several reasons, but when I ran the command:
dataset = pd.read_csv('C:/Users/.../GermanNews/articles.csv',
encoding='utf-8', header=None, sep='delimiter')
I get the following dataset back:
0
0 Etat;Die ARD-Tochter Degeto hat sich verpflich...
1 Etat;App sei nicht so angenommen worden wie ge...
2 Etat;'Zum Welttag der Suizidprävention ist es ...
3 Etat;Mitarbeiter überreichten Eigentümervertre...
4 Etat;Service: Jobwechsel in der Kommunikations...
so I only get one column instead of the two desired columns,
Target: any idea how to load the dataset correctly that I have:
0 1
0 Etat Die ARD-Tochter Degeto hat sich verpflich...
1 Etat App sei nicht so angenommen worden wie ge...
Hints/Tries:
When I run the search function over my data in excel, I am also not finding any ;
in it.
It seems like that some lines have more then two columns (as you can see for example in line 3 and 13 of my example