
I am currently using the following to read a 2 GB csv file, but it gives a memory error on the third line (the pd.read_csv call).

import pandas as pd
import csv

SimData    =   pd.read_csv(r'C:\MyFileofSize2GB.csv')
columns    =   SimData.columns.tolist()

I intend to segregate the columns and store each one in a separate csv file.

Edit:

MyFileofSize2GB.csv has 10 columns, and I want to create 10 separate csv files after reading it in Python, one file for each column. However, the read itself fails with a memory error when I try to load MyFileofSize2GB.csv.
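
One way to avoid holding the whole file in memory is pandas' chunksize option, which yields the file in blocks of rows. A minimal sketch of that idea (the chunk size and the per-column output file names are assumptions):

import pandas as pd

SRC = r'C:\MyFileofSize2GB.csv'
CHUNK_ROWS = 100000  # rows per block; tune to available memory

first_chunk = True
for chunk in pd.read_csv(SRC, chunksize=CHUNK_ROWS):
    for col in chunk.columns:
        # append this block of the column to its own csv file
        # (one file per column, named after the column, is an assumption)
        chunk[[col]].to_csv(col + '.csv',
                            mode='w' if first_chunk else 'a',
                            header=first_chunk,
                            index=False)
    first_chunk = False

Only one block of rows is ever in memory, and each column file is built up incrementally as the blocks are processed.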

Zanam
  • Then, do it, why are you asking here? – hd1 Dec 22 '15 at 18:09
  • Question is about memory error. – Zanam Dec 22 '15 at 18:19
  • if you read in separate CSV files, the memory will be reduced. Now, if you have tried this, and it still gives you an out of memory error, that's a different story from the one you put in your question. – hd1 Dec 22 '15 at 19:13
  • Sorry I think I was not clear in my question so I added some statements to it to make it clearer. – Zanam Dec 22 '15 at 19:20
  • 1
    Might it not be easier to split the csv file along rows, since you'd be reading it in line by line in python anyway? – LateCoder Dec 22 '15 at 19:22
  • You might look at [Dask DataFrames](http://dask.pydata.org/en/latest/dataframe.html) for this – mattexx Dec 22 '15 at 19:26
  • Maybe this thread helps: http://stackoverflow.com/questions/11622652/large-persistent-dataframe-in-pandas/12193309#12193309 – Shahram Dec 23 '15 at 04:29
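
Following the suggestion in the comments to avoid loading every column at once, another minimal sketch uses read_csv's usecols parameter to read one column per pass, so only about a tenth of the data is in memory at any time, at the cost of re-parsing the file once per column (output file names are assumptions):

import pandas as pd

SRC = r'C:\MyFileofSize2GB.csv'

# read only the header row to discover the column names
columns = pd.read_csv(SRC, nrows=0).columns

for col in columns:
    # each pass re-parses the file but keeps only one column in memory
    single = pd.read_csv(SRC, usecols=[col])
    single.to_csv(col + '.csv', index=False)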

0 Answers