I am working with the data from https://opendata.rdw.nl/Voertuigen/Open-Data-RDW-Gekentekende_voertuigen_brandstof/8ys7-d773 (the CSV file can be downloaded using the 'Exporteer' button).
When I import the data into R using read.csv(), it takes 3.75 GB of memory, but when I import it into pandas using pd.read_csv(), it takes 6.6 GB.
Why is this difference so large?
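For reference, this is roughly how I load the file in pandas (a minimal sketch; the filename is just a placeholder for whatever the export produced):

import pandas as pd

# Placeholder filename for the exported CSV; adjust to the actual download path.
df = pd.read_csv("Open_Data_RDW__Gekentekende_voertuigen_brandstof.csv")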
I used the following code to determine the memory usage of the dataframes in R:
library(pryr)    # pryr::object_size() reports how much memory an object actually uses
object_size(df)  # df is the data frame returned by read.csv()
and in Python:
df.info(memory_usage="deep")  # "deep" makes pandas count the actual string objects, not just the pointers to them
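In case it is relevant, a per-column breakdown can be obtained the same way (a sketch, assuming the df loaded above):

df.memory_usage(deep=True)  # bytes per column, including the underlying string objects
df.dtypes                   # which columns pandas parsed as object (strings) vs. numeric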