I can't find any information about this and am not sure what other keywords I could google, so apologies if this is a duplicate.
I have some lists of data.tables in my workspace, as displayed here:
> lsos()
                  Type     Size PrettySize Rows Columns
all_subsets       list 46673512    44.5 Mb    3      NA
glm_Macro.part_1  list 15817064    15.1 Mb    2      NA
glm_Macro.part_2  list 15817064    15.1 Mb    2      NA
glm_Macro.part_3  list 15289864    14.6 Mb    2      NA
I then need to save the last three objects listed above to disk. I do this simply using save() and the .rda extension, e.g.
save(glm_Macro.part_1, file = "glm_Macro.part_1.rda")
Looking on the disk, however, the three files are 270.7, 268.8 and 262.6 MB respectively. That is roughly 18 times larger than the sizes reported in the workspace.
Is there a known reason for this?
My only hunch is the way data.table uses referencing, meaning data is not copied but only referenced from the original data set. See here for an example of how that works.
So when I save the data to disk, maybe it forces a copy of every data.table, whereas references were sufficient within the R workspace.
The terminal, RStudio and ESS (Emacs) all show the same sizes in the workspace, so it does not seem to be related to the front end I run R in.
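For reference, this is roughly how the two sizes can be compared. It is only a minimal sketch in base R: `example_list` is a hypothetical stand-in for one of my lists (the real objects are not reproduced here), and the file is written to a temporary path rather than to my actual .rda files.

```r
# Hypothetical stand-in object; NOT the real glm_Macro.part_1 list.
example_list <- list(
  a = data.frame(x = rnorm(1e4)),
  b = data.frame(y = rnorm(1e4))
)

# Save to a temporary .rda file, the same way as in the question.
tmp <- tempfile(fileext = ".rda")
save(example_list, file = tmp)

# Size as reported inside R (bytes) versus size of the file on disk (bytes).
in_memory <- as.numeric(object.size(example_list))
on_disk   <- file.size(tmp)

cat("in memory:", in_memory, "bytes\n")
cat("on disk:  ", on_disk, "bytes\n")
cat("ratio (disk / memory):", on_disk / in_memory, "\n")

# Clean up the temporary file.
unlink(tmp)
```

Running the same comparison on the real lists is what gives me the ~18x ratio described above.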