
I am using an i7 laptop with 8 GB of RAM (no viruses). I am trying to read a 2.2 GB CSV file into a pandas DataFrame using the following code:

  tp = pd.read_csv(r'..\train_ver2\train_ver2.csv', iterator=True, chunksize=10000)
  df = pd.concat(tp, ignore_index=True)
  print(df)

but my system just hangs every time I run this. Is there a problem with my code, or is my system not powerful enough? Any thoughts or suggestions are welcome. Thank you!
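Note that concatenating every chunk back together rebuilds the entire 2.2 GB file as a single DataFrame in memory, which on an 8 GB machine can exhaust RAM and make the system appear to hang. If the data can be handled piece by piece instead of all at once, a chunk-by-chunk pass keeps memory usage roughly constant. A minimal sketch, using the same file path as above; the per-chunk work (counting rows here) is only a placeholder for whatever processing is actually needed:

  import pandas as pd

  # Read the file in chunks and process each one as it arrives,
  # so only about 10,000 rows are held in memory at any time.
  total_rows = 0
  reader = pd.read_csv(r'..\train_ver2\train_ver2.csv', chunksize=10000)
  for chunk in reader:
      # placeholder work: count rows; replace with real per-chunk processing
      total_rows += len(chunk)
  print(total_rows)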

  • Googling for "python large csv site:stackoverflow.com" will show you a lot of existing questions to help fix this. You can certainly make your code handle the CSV file more intelligently. – skrrgwasme Oct 28 '16 at 18:28
  • You may also get better results from just reducing your `chunksize` parameter. – skrrgwasme Oct 28 '16 at 18:30
  • @skrrgwasme Thanks for the tip. I was finally able to run the query above, but it looks like the chunksize parameter is not working: the query returns all 11 million records instead of the 10,000 I specified, for some reason. – Uasthana Oct 28 '16 at 18:38
  • @skrrgwasme Oh, I just figured out what chunksize is actually for... my comment above looks stupid now. – Uasthana Oct 28 '16 at 18:51
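As the comment thread works out, `chunksize=10000` does not cap the result at 10,000 rows; it makes `read_csv` return an iterator that yields DataFrames of up to 10,000 rows each, and `pd.concat` then stitches all of those pieces back into the full roughly 11-million-row frame, which is why every record came back. A quick sketch of the difference, again using the file path from the question:

  import pandas as pd

  reader = pd.read_csv(r'..\train_ver2\train_ver2.csv', chunksize=10000)
  first_chunk = next(iter(reader))   # a single DataFrame of up to 10,000 rows
  print(first_chunk.shape)

  # pd.concat(reader, ignore_index=True) would reassemble the remaining chunks
  # into one full DataFrame, reproducing the memory problem described above.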

0 Answers