I'm trying to read in a 13 GB CSV file using the following code:
import pandas as pd

# read the file in chunks, then concatenate the chunks into one DataFrame
chunks = pd.read_csv('filename.csv', chunksize=10000000)
df = pd.DataFrame()
%time df = pd.concat(chunks, ignore_index=True)
I have played with values of the chunksize parameter from 10**3 to 10**7, but every time I get a MemoryError.
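For illustration, my attempts looked roughly like the loop below (just a sketch of what I tried; in practice I re-ran the read manually with different sizes):

import pandas as pd

# each of these chunk sizes still ends in a MemoryError at the concat step
for size in (10**3, 10**4, 10**5, 10**6, 10**7):
    chunks = pd.read_csv('filename.csv', chunksize=size)
    df = pd.concat(chunks, ignore_index=True)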
The CSV file has about 3.3 million rows and 1,900 columns.
I can clearly see that I have 30+ GB of memory available before I start reading the file, but I still get a MemoryError. How do I fix this?