
I have been trying to open a large .csv file in Google Colab for several hours now and keep getting the error:

ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed.

The code I'm using is:

import pandas as pd

path = "/content/drive/My Drive/data/big.csv"
# df_bonus = pd.read_csv(path, engine="python")  # also fails; see note below
df_bonus = pd.read_csv(path)
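
For context, Drive is mounted in the usual way before this runs:

from google.colab import drive

# Mount Google Drive at /content/drive so the path above resolves.
drive.mount('/content/drive')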

Note: I've tried it both with and without the engine="python" parameter. Using engine="python" gives the error

[Errno 5] Input/output error

which seems to be connected to hidden Google Drive quotas.
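
One workaround I've been considering, assuming the failures come from streaming reads over the Drive mount, is copying the file to the Colab VM's local disk first and parsing it there (the local destination path is just an example):

import shutil
import pandas as pd

# Copy from the mounted Drive to the VM's local disk, then parse locally.
local_path = "/content/big.csv"
shutil.copy("/content/drive/My Drive/data/big.csv", local_path)
df_bonus = pd.read_csv(local_path)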

In the past, I've occasionally gotten this error, but after retrying a few times I have been able to read the file. This time, however, it has failed maybe 30 times in a row.

I have tried with and without engine="python", and with no accelerator, with a GPU, and with a TPU runtime. I was also once prompted that I needed more RAM and accepted the upgrade. I am still consistently getting errors: sometimes the error appears immediately, other times after about five minutes.
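
Since RAM may be part of the problem, I've also been thinking about reading the file in chunks to keep the parser's peak memory lower (a sketch; the chunk size is arbitrary):

import pandas as pd

path = "/content/drive/My Drive/data/big.csv"

# Parse the CSV in chunks; concatenating at the end still builds the
# full DataFrame, but each individual read from the source is smaller.
chunks = pd.read_csv(path, chunksize=100_000)
df_bonus = pd.concat(chunks, ignore_index=True)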

Can anyone suggest a way to get this file to open, and explain why it's failing now? Could it be that Google's servers are simply busy? And why was the error intermittent before, while now I can't read the file at all?

Thanks for any suggestions.

user6631314
