
I have a large dataset (a CSV file) that I am reading into my Python environment. I need to split the dataset when it reaches 1 GB. My initial dataset is around 1.8 GB once read, so I should end up with two datasets: one of 1 GB and another with the remainder.

How can I do this?

The solution should consider both time and space complexity.
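A minimal sketch of one way to do this, assuming the data fits in memory as a pandas DataFrame and that rows are roughly uniform in size (the file name `data.csv` is a placeholder):

```python
import pandas as pd

ONE_GB = 1024 ** 3  # split threshold in bytes

# Hypothetical path; replace with your actual CSV file.
df = pd.read_csv("data.csv")

# Estimate the average in-memory size of one row from the
# deep memory usage (includes the contents of object columns).
total_bytes = df.memory_usage(deep=True).sum()
rows_per_gb = int(ONE_GB // (total_bytes / len(df)))

# Slice once; the cost is proportional to the rows copied, and
# nothing beyond the two resulting frames is allocated.
first_part = df.iloc[:rows_per_gb]
second_part = df.iloc[rows_per_gb:]
```

If holding the full 1.8 GB in memory at once is the concern, `pd.read_csv(..., chunksize=...)` reads the file in fixed-size row batches instead, so only one chunk is resident at a time.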

Payal Bhatia
  • Does this answer your question? [How to estimate how much memory a Pandas' DataFrame will need?](https://stackoverflow.com/questions/18089667/how-to-estimate-how-much-memory-a-pandas-dataframe-will-need) – endive1783 Jul 13 '22 at 11:40
  • Yes, to a certain extent. I found one strange thing: when I read it using the memory_usage() function (sum(data.memory_usage(deep=True))) and add up all the numbers, it comes to around 4.9 GB. However, the actual file is just 1.3 GB? – Payal Bhatia Jul 13 '22 at 13:12
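Regarding the discrepancy in the last comment: `memory_usage(deep=True)` counts the full Python object overhead of every string cell (object dtype), so the in-memory size of a DataFrame is routinely several times its on-disk CSV size. A small illustration, assuming CPython and pandas:

```python
import pandas as pd

df = pd.DataFrame({"s": ["abc"] * 1_000_000})

# Shallow: only the 8-byte pointer per cell is counted.
print(df.memory_usage().sum())           # ~8 MB
# Deep: the size of each Python string object is added,
# roughly 50 bytes of overhead per short string on CPython.
print(df.memory_usage(deep=True).sum())  # ~60 MB
```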

0 Answers