
I am trying to load a CSV file (around 250 MB) as a DataFrame with pandas. On my first try I used a plain read_csv call, but I got a memory error. I have since tried the chunked approach mentioned in Large, persistent DataFrame in pandas:

import pandas as pd
# read the file lazily in 1000-row chunks, then concatenate the chunks
x = pd.read_csv('myfile.csv', iterator=True, chunksize=1000)
xx = pd.concat([chunk for chunk in x], ignore_index=True)

but when I tried to concatenate I received the following error: Exception: "All objects passed were None". In fact, I cannot access the chunks at all.

I am using WinPython 3.3.2.1 (32-bit) with pandas 0.11.0.

user2082695

2 Answers


I suggest that you install the 64-bit version of WinPython. A 32-bit process only gets about 2 GB of address space, and the parsed DataFrame needs considerably more memory than the 250 MB the file occupies on disk, so that limit is easy to hit. With the 64-bit build you should be able to load a 250 MB file without problems.
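
If you are not sure which build you are currently running, here is a quick check using only the standard library (a minimal sketch, nothing WinPython-specific):

import struct
import sys

# prints 32 on a 32-bit interpreter and 64 on a 64-bit one
print(struct.calcsize("P") * 8)
# True only on a 64-bit build
print(sys.maxsize > 2**32)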

w-m

I'm late, but the actual problem with the posted code is that pd.concat([chunk for chunk in x]) effectively cancels any benefit of chunking: it concatenates all those chunks back into one big DataFrame again, which probably even requires twice the memory temporarily (the individual chunks plus the concatenated copy).
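
To actually benefit from chunking, reduce or process each chunk as it arrives instead of rebuilding the full table. A minimal sketch, assuming a made-up column name 'category' in myfile.csv:

import pandas as pd

# aggregate per chunk so only one chunk is held in memory at a time
counts = None
for chunk in pd.read_csv('myfile.csv', chunksize=1000):
    part = chunk['category'].value_counts()   # 'category' is an assumed column
    counts = part if counts is None else counts.add(part, fill_value=0)

print(counts)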

Norman