Python 3 Pandas Error: pandas.parser.CParserError: Error tokenizing data. C error: Expected 11 fields in line 5, saw 13

Question

I checked out this answer as I am having a similar problem.

However, for some reason ALL of my rows are being skipped.

My code is simple:

import pandas as pd

fname = "data.csv"
input_data = pd.read_csv(fname)

and the error I get is:

  File "preprocessing.py", line 8, in <module>
    input_data = pd.read_csv(fname) #raw data file ---> pandas.core.frame.DataFrame type
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 465, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 251, in _read
    return parser.read()
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 710, in read
    ret = self._engine.read(nrows)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/pandas/io/parsers.py", line 1154, in read
    data = self._reader.read(nrows)
  File "pandas/parser.pyx", line 754, in pandas.parser.TextReader.read (pandas/parser.c:7391)
  File "pandas/parser.pyx", line 776, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:7631)
  File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._read_rows (pandas/parser.c:8253)
  File "pandas/parser.pyx", line 816, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8127)
  File "pandas/parser.pyx", line 1728, in pandas.parser.raise_parser_error (pandas/parser.c:20357)
pandas.parser.CParserError: Error tokenizing data. C error: Expected 11 fields in line 5, saw 13

So somehow we're supposed to reverse-engineer from the error your data that produced it? Please post sample raw input data — EdChum, Apr 20 '15 at 17:46
It looks like your CSV doesn't have the same number of fields on every line. Try opening it in Excel or your favorite spreadsheet program to verify its structure. — MattDMo, Apr 20 '15 at 17:50
This description got me here and this was the same problem I had. +1 for that. — calmrat, Aug 02 '15 at 18:15
Dynamically generate column names for variable number of columns for read_csv(): https://stackoverflow.com/a/52890095/1427624 — P-S, Oct 19 '18 at 10:01

score 10 · Answer 1 · answered Apr 20 '15 at 17:59

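The solution is to use pandas' built-in delimiter "sniffing": pass sep=None so that read_csv detects the separator itself. Detection is only supported by the Python parsing engine, so request it explicitly to avoid a fallback warning.

input_data = pd.read_csv(fname, sep=None, engine="python")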
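
To see what sniffing does without involving pandas, here is a minimal sketch built on csv.Sniffer from the standard library, which the pandas documentation names as the tool behind sep=None. The helper name detect_delimiter, the candidate list, and the sample text are illustrative assumptions, not part of the original answer.

```python
import csv

def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the delimiter of a text sample (hypothetical helper)."""
    # Restricting the candidate characters makes the guess more
    # reliable on small samples.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter

# Assumed sample: a tab-separated file that merely has a .csv name.
sample = "id\tname\tscore\n1\tann\t3.5\n2\tbob\t4.0\n"
print(repr(detect_delimiter(sample)))
```

If the detected delimiter is not a comma, passing it explicitly as sep is usually more predictable than sniffing on every read.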

answered Apr 20 '15 at 17:59

user1452494

score 5 · Answer 2 · answered Aug 30 '16 at 20:19

For those landing here: I got this error when the file was actually an .xls file, not a true .csv. Try re-saving it as CSV from a spreadsheet application.
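
One way to check for this before calling read_csv is to look at the first bytes of the file. The sketch below assumes the two well-known container signatures: the OLE2 header used by legacy .xls files and the ZIP header used by .xlsx files. The function name looks_like_excel is hypothetical, and a ZIP signature only shows that the file is some ZIP archive, not necessarily a workbook.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of the two Excel container formats.
XLS_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file (.xls)
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP archive (.xlsx and others)

def looks_like_excel(path) -> bool:
    """Return True if the file starts with an Excel-style binary signature."""
    with Path(path).open("rb") as fh:
        head = fh.read(8)
    return head.startswith(XLS_MAGIC) or head.startswith(ZIP_MAGIC)
```

If this returns True for a file named data.csv, re-exporting it as CSV (or reading it with an Excel reader instead) is more likely to help than changing read_csv options.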

answered Aug 30 '16 at 20:19

Kate Stohr

Wow. Thank you. Nothing was working and I spent like 2 hours googling how to figure this out. I tried everything! Turns out, the "csv" sent to me was actually a "txt" file, not a true csv. I have no idea how that even happened, since it ends in ".csv" but thank you! – ArthurH Sep 12 '19 at 18:00

score 2 · Answer 3 · answered Sep 22 '17 at 03:24

I had the same error. I first read my CSV data with d1 = pd.read_csv('my.csv'), then tried d1 = pd.read_csv('my.csv', sep='\t'), and this time it worked. Try this if your delimiter is not ',': the default separator is ',', so parsing goes wrong unless you specify the actual one. See the pandas.read_csv documentation.

answered Sep 22 '17 at 03:24

ShenDu

Amazing! Thank you a lot! It solved my problem too. – SnowBG Apr 01 '19 at 13:07

score 1 · Answer 4 · answered Aug 11 '19 at 22:18

This error means that the rows do not all have the same number of columns. In your case, the parser expected 11 columns based on the earlier lines, but line 5 has 13 fields.

To investigate, you can read the file with the csv module and inspect each row:

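import csv

with open('filename.csv', 'r', newline='') as file:
    # Use a comma delimiter for a true CSV file; change it if your file uses another separator.
    reader = csv.reader(file, delimiter=',')
    for row in reader:
        # Printing the field count makes over-long rows easy to spot.
        print(len(row), row)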
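
Printing every row gets unwieldy on large files. A possible refinement, sketched here with a hypothetical helper find_ragged_rows and made-up sample data, is to report only the lines whose field count differs from the first row:

```python
import csv
import io

def find_ragged_rows(text_stream, delimiter=","):
    """Yield (line_number, field_count) for rows whose width differs from the first row."""
    reader = csv.reader(text_stream, delimiter=delimiter)
    expected = None
    for row in reader:
        if expected is None:
            expected = len(row)  # width of the first (header) row
        elif len(row) != expected:
            # reader.line_num is the number of physical lines read so far.
            yield reader.line_num, len(row)

# Assumed sample: the third line has five fields instead of three.
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6,7,8\n9,10,11\n")
print(list(find_ragged_rows(sample)))  # [(3, 5)]
```

With a real file, pass the object returned by open(path, newline="") instead of the StringIO. Blank lines are reported too, since they parse as rows with zero fields.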

score 0 · Answer 5 · edited May 23 '17 at 12:02

This parsing error can occur for several reasons. Solutions for different causes have been posted here as well as in the question "Python Pandas Error tokenizing data".

I posted a solution to one possible reason for this error here: https://stackoverflow.com/a/43145539/6466550

edited May 23 '17 at 12:02

Community

answered Apr 03 '17 at 14:37

computerist

score 0 · Answer 6 · edited Jul 19 '19 at 04:49

I have had similar problems. With my CSV files it occurs because they were created in R, so they have some extra commas and different spacing than a "regular" CSV file.

I found that if I read the data with read.table in R, I could then save it with write.csv and the option row.names = F.

I could not get any of the read options in pandas to help me.

edited Jul 19 '19 at 04:49

VerySeriousSoftwareEndeavours

answered Jul 19 '19 at 01:55

BrianM

score 0 · Answer 7 · answered Apr 14 '20 at 22:48

The problem could be that one or more rows of the CSV file contain more delimiters (commas) than expected. The error goes away once every row has the same number of delimiters as the first line of the file, where the column names are defined.
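
One common way a row ends up with extra delimiters is a field that itself contains a comma and was written without quotes. The sketch below uses made-up data and is only an illustration of that one cause, not a diagnosis of the file in the question. It compares a naive join with csv.writer, which quotes such fields:

```python
import csv
import io

row = ["42", "Smith, John", "NY"]  # three logical fields

# Naive join: the comma inside the name becomes an extra delimiter.
naive_line = ",".join(row)
naive_fields = next(csv.reader([naive_line]))

# csv.writer quotes the field that contains the delimiter.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
quoted_line = buffer.getvalue().strip()
quoted_fields = next(csv.reader([quoted_line]))

print(len(naive_fields), len(quoted_fields))  # 4 3
```

If the file comes from an export you control, fixing the quoting at the source is generally safer than dropping or patching rows afterwards.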

answered Apr 14 '20 at 22:48

jmish

score 0 · Answer 8 · answered Jul 19 '22 at 12:58

Use \t+ as the separator pattern instead of \t, so that a run of consecutive tabs is treated as a single delimiter.

import pandas as pd

fname = "data.csv"
input_data = pd.read_csv(fname, sep=r'\t+', header=None, engine='python')
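
The difference between the two patterns can be seen without pandas. This is a small sketch with an assumed sample line, using only the standard library:

```python
import re

line = "a\t\tb\tc"  # two consecutive tabs between "a" and "b"

single_tab = line.split("\t")      # each tab is its own separator
tab_run = re.split(r"\t+", line)   # a run of tabs counts as one separator

print(single_tab)  # ['a', '', 'b', 'c']
print(tab_run)     # ['a', 'b', 'c']
```

Note the trade-off: collapsing runs of tabs also hides fields that are genuinely empty, so this only suits files where repeated tabs are padding rather than missing values.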

answered Jul 19 '22 at 12:58

Derrick Kuria
