I have been trying to open a large .csv file in Google Colab for several hours now and keep getting the error:
ParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed.
The code I'm using is:
import pandas as pd

path = "/content/drive/My Drive/data/big.csv"
# df_bonus = pd.read_csv(path, engine="python")  # also tried with the python engine
df_bonus = pd.read_csv(path)
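For completeness, I mount Drive beforehand with the standard Colab API, roughly like this:

from google.colab import drive
drive.mount('/content/drive')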
Note: I've tried it both with and without the engine="python" parameter. Using engine="python" gives the error
[Errno 5] Input/output error
which seems to be connected to hidden quotas.
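Since that looks like an I/O problem on the Drive mount itself, one workaround I'm considering (just a sketch; the local path is an arbitrary choice of mine) is copying the file to the Colab VM's local disk first and reading it from there:

import shutil
import pandas as pd

drive_path = "/content/drive/My Drive/data/big.csv"
local_path = "/content/big.csv"  # the VM's local disk, not Drive

shutil.copyfile(drive_path, local_path)  # the I/O error may just surface here instead
df_bonus = pd.read_csv(local_path)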
In the past, I've gotten this error occasionally, but after a few retries I have been able to read the file. This time, however, it has failed maybe 30 times in a row.
I have tried with and without engine="python", and with no accelerator, with a GPU, and with a TPU. Once I got a message that I needed more RAM and accepted the upgrade, but I still consistently get errors. Sometimes the error appears immediately; other times it appears after about five minutes.
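In case memory pressure is part of the problem, I could also try reading the file in chunks (a sketch; the 100,000-row chunk size is an arbitrary guess on my part):

import pandas as pd

path = "/content/drive/My Drive/data/big.csv"
# chunksize makes read_csv return an iterator of DataFrames instead of one big frame
chunks = pd.read_csv(path, chunksize=100_000)
df_bonus = pd.concat(chunks, ignore_index=True)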
Can anyone suggest a way to get this file to open, or explain why it's failing now? Could it be that Google is just very busy? Why is the error intermittent, and why am I now unable to read the file at all?
Thanks for any suggestions.