0

I'm using Twint to create a .csv file with ten results. But whenever I try to load it into a pandas dataframe, I get an error. Can someone help me to understand what is going on?

Traceback (most recent call last):
  File "k:\Documents\Visual Studio Code\Twitter Project\exploratory stage.py", line 4, in <module>
    scrapedData = pd.read_csv('demo.csv')
  File "K:\Programs\Python\lib\site-packages\pandas\util\_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\readers.py", line 586, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\readers.py", line 488, in _read
    return parser.read(nrows)
  File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\readers.py", line 1047, in read
    index, columns, col_dict = self._engine.read(nrows)
  File "K:\Programs\Python\lib\site-packages\pandas\io\parsers\c_parser_wrapper.py", line 223, in read
    chunks = self._reader.read_low_memory(nrows)
  File "pandas\_libs\parsers.pyx", line 801, in pandas._libs.parsers.TextReader.read_low_memory
  File "pandas\_libs\parsers.pyx", line 857, in pandas._libs.parsers.TextReader._read_rows
  File "pandas\_libs\parsers.pyx", line 843, in pandas._libs.parsers.TextReader._tokenize_rows
  File "pandas\_libs\parsers.pyx", line 1925, in pandas._libs.parsers.raise_parser_error 
pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 3

-Edit-

I looked at my CSV file and realized that the data was formatted strangely. A whole line of information, including the username, date, time and tweet, would be crammed into a single cell.

And for a few other rows, the tweets would break off and continue in the cell next to it. It looks something like this.

Screenshot of my data

Kynan E
  • 1
    As with all questions, please post the code that you used to get this error. However, the answer seems to be here: error: Expected 1 fields in line 3, saw 3 – Gregory Sky Oct 14 '21 at 08:18
  • please also post what your data inside .csv looks like – Mahrkeenerh Oct 14 '21 at 08:19
  • 1
    try `error_bad_lines=False` if this works then it is a repost of https://stackoverflow.com/questions/18039057/python-pandas-error-tokenizing-data – Akmal Soliev Oct 14 '21 at 08:21

2 Answers

1

Whenever you ask a pandas question, you should always, if possible, provide a few lines of your data so that people can help you more efficiently.

The error states that your third line contains 3 fields where it expects only 1.

This can happen if your CSV is formatted incorrectly. The solution, in your case, is to fix the format or try setting error_bad_lines=False.

This example throws the same error:

from io import StringIO
import pandas as pd

data = """name
brad
susi,tina,ellen
peter
"""

pd.read_csv(StringIO(data))

Output:

ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 3
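To find out which lines are off before deciding how to handle them, the number of fields per line can be counted with the standard `csv` module. This is a minimal sketch on the same sample data; the helper name `field_counts` is made up for illustration, and it assumes the file is comma-delimited.

```python
import csv
from io import StringIO

data = """name
brad
susi,tina,ellen
peter
"""


def field_counts(handle, delimiter=","):
    """Return (line_number, number_of_fields) for every row in the handle."""
    reader = csv.reader(handle, delimiter=delimiter)
    return [(reader.line_num, len(row)) for row in reader]


counts = field_counts(StringIO(data))
expected = counts[0][1]  # the header line defines the expected field count
bad_lines = [(line, n) for line, n in counts if n != expected]
print(bad_lines)  # [(3, 3)]
```

For a file on disk, the `StringIO(data)` argument would be replaced by a file opened with `open(path, newline="")`.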

Solution

Fix the CSV file, or set error_bad_lines=False, which skips the faulty lines:

df = pd.read_csv(StringIO(data), error_bad_lines=False)
print(df)

Output:

Note the missing row susi,tina,ellen

    name
0   brad
1  peter

  exec(code_obj, self.user_global_ns, self.user_ns)
b'Skipping line 3: expected 1 fields, saw 3\n'
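Note that `error_bad_lines` was deprecated in pandas 1.3 and removed in pandas 2.0; its replacement is the `on_bad_lines` parameter. A minimal sketch of the same call, assuming pandas 1.3 or later is installed:

```python
from io import StringIO

import pandas as pd

data = """name
brad
susi,tina,ellen
peter
"""

# on_bad_lines="skip" drops rows with too many fields instead of raising.
# on_bad_lines="warn" would also skip them, but emit a warning for each one.
df = pd.read_csv(StringIO(data), on_bad_lines="skip")
print(df)
```

Skipping only hides the problem, so it is worth checking how many rows were dropped before relying on the result.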
Stefan Falk
  • Thanks for the reply. I looked at my csv file and realized that the data was formatted strangely. One whole line of information including the username, date time and tweet would all be cramped into a cell. And for a few other rows, the tweets would break off and continue in the cell next to it. – Kynan E Oct 15 '21 at 14:21
-1

When the CSV is in use by another program or application, the OS may "lock" the file until that operation is finished.

Make sure that when you create a .csv file, you tell the OS to close the file and end the operation.

This is an example of opening and closing a file:

f = open("test.csv", "w")
f.write("test")
f.close()

Without the f.close(), the file may stay "locked" by the OS and can't be accessed by another program or process.
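A `with` block is the more common way to make sure a file gets closed, since it closes the file even if an exception is raised while writing. This is only a sketch: the temporary directory and the sample content are placeholders, not part of the original question.

```python
import os
import tempfile

# Placeholder location so the example does not touch any real files.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

with open(path, "w", newline="") as f:
    f.write("name\nbrad\n")
# The file is closed here, without an explicit f.close().

with open(path) as f:
    content = f.read()

print(f.closed)  # True
```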

Passi
  • It is often the case that some other process is blocking access to the file. It can be a solution. – Passi Oct 14 '21 at 09:09
  • 1
    The error is `pandas.errors.ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 3`. The file is already open. idk who keeps upvoting your answer. – Stefan Falk Oct 14 '21 at 09:09