Why did my DataFrame grow in size after using df.to_csv() and df = pd.read_csv() operations?

Question

Doing a quick check to see the size of my DataFrame shows that it's 2.4 GB and all columns are in float32 format.

df.info(memory_usage="deep")

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1468379 entries, 1516035660 to 1604138340
Columns: 441 entries, Open to 2099520 EMA
dtypes: float32(441)
memory usage: 2.4 GB

I save the DataFrame to my PC and then reload it using the following:

df.to_csv("CSV of BTC Price Plus Indicators.csv")
df = pd.read_csv('/Users/Moonboi/Coding/Crypto/CSV of BTC Price Plus Indicators.csv').set_index("Timestamp")

Then check the memory usage:

df.info(memory_usage="deep")

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1468379 entries, 1516035660 to 1604138340
Columns: 441 entries, Open to 2099520 EMA
dtypes: float64(441)
memory usage: 4.8 GB

Now it takes up double the space and uses float64 instead of float32. What's this all about? Can't I keep it in float32 format?
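To illustrate the behaviour described above: a CSV file is plain text and stores no dtype information, so `pd.read_csv` parses decimal numbers as float64 by default. The sketch below uses a tiny stand-in frame (the values and the two column names are made up for illustration) and shows one way to request float32 on load through the `dtype` parameter. The dtype is applied per column, leaving out `Timestamp`, on the assumption that the integer index should not be cast to float32.

```python
import io

import numpy as np
import pandas as pd

# Tiny stand-in for the real data (hypothetical values, two columns only).
df = pd.DataFrame(
    {"Open": [1.5, 2.25, 3.125], "Close": [1.75, 2.5, 3.0]},
    index=pd.Index([1516035660, 1516035720, 1516035780], name="Timestamp"),
).astype(np.float32)

# Write to an in-memory buffer instead of a file; the CSV text carries no dtypes.
buffer = io.StringIO()
df.to_csv(buffer)

# Default parsing: the float columns come back as float64.
buffer.seek(0)
default_df = pd.read_csv(buffer, index_col="Timestamp")

# Read only the header to collect the column names, then request float32
# for every column except the integer Timestamp index.
buffer.seek(0)
columns = pd.read_csv(buffer, nrows=0).columns
float32_dtypes = {name: np.float32 for name in columns if name != "Timestamp"}

buffer.seek(0)
float32_df = pd.read_csv(buffer, index_col="Timestamp", dtype=float32_dtypes)

print(default_df.dtypes.unique())   # float64
print(float32_df.dtypes.unique())   # float32
```

With 441 columns, building the dtype mapping from the header avoids typing the names by hand.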

I think [this](https://stackoverflow.com/a/50643113/8953890) might have the solution you are looking for. — Pooja Sonkar, Jul 05 '21 at 13:38
Thanks, so this enabled me to read the CSV into my workbook using only 2.4 GB, but I still can't save it without it taking up 5 GB+ of space on my PC. — MoonBoi9001, Jul 05 '21 at 14:00
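On the file size mentioned in the last comment: a CSV stores each number as text, so a float32 value that occupies 4 bytes in memory is written out as several characters plus a separator. The sketch below is not from the thread; it assumes a binary format is acceptable and uses `DataFrame.to_pickle` as one possible option, with random stand-in data and hypothetical file names, to compare the size on disk.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Random float32 stand-in data (hypothetical; the real frame is far larger).
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.random((10_000, 20), dtype=np.float32),
    columns=[f"col_{i}" for i in range(20)],
)
df.index.name = "Timestamp"

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "prices.csv")     # hypothetical file name
    pickle_path = os.path.join(tmp, "prices.pkl")  # hypothetical file name

    df.to_csv(csv_path)        # text: every value is written as characters
    df.to_pickle(pickle_path)  # binary: values are stored as raw float32

    csv_size = os.path.getsize(csv_path)
    pickle_size = os.path.getsize(pickle_path)

    # A pickle round trip keeps the dtypes, so no dtype argument is needed.
    restored = pd.read_pickle(pickle_path)

print(f"CSV:    {csv_size / 1e6:.2f} MB")
print(f"Pickle: {pickle_size / 1e6:.2f} MB")
print(restored.dtypes.unique())
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.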

0 Answers