I have a very big boolean vector (like a=c(TRUE, TRUE, FALSE, FALSE) but many times larger) and would like to store it in a file as compactly as possible. What is the easiest way to do it?
Thanks,
As the linked question suggests, saving as a binary rds file using saveRDS is the best option, provided that you only want to use the resulting file with R, rather than any other programs.
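If your vector doesn't have any missing values, you can convert the logical vector to a bit vector, which in the example below takes up about half as much space on disk. (It also uses less memory in your workspace, since a bit vector stores one bit per element whereas a logical vector uses four bytes per element.)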
library(bit)
x <- runif(1e6) > 0.5
x2 <- as.bit(x)
saveRDS(x, "x.rds") # takes up 246kb
saveRDS(x2, "x2.rds") # takes up 123kb
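To check that nothing is lost along the way, here is a rough sketch of reading the bit vector back and converting it to an ordinary logical vector. The temporary file path and the seed are my own choices for illustration, and the sizes you see may differ from the figures above.

```r
library(bit)

set.seed(1)                          # arbitrary seed, only for reproducibility
x <- runif(1e6) > 0.5
path <- tempfile(fileext = ".rds")   # hypothetical location for the file

saveRDS(as.bit(x), path)

# Read the bit vector back and convert it to a plain logical vector
y <- as.logical(readRDS(path))

all(x == y)       # element-wise comparison with the original vector
file.size(path)   # size on disk, in bytes
```

The bit package needs to be loaded when reading the file back, so that the stored object is treated as a bit vector.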
If you need to reuse the variable in other programs, then choose a format that those programs can read. HDF5 is a common, compact format that may be suitable.
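If HDF5 is more than you need, one alternative (not something the options above cover, just a sketch) is to pack eight values into each byte with base R's packBits and write the raw bytes with writeBin. The file name and the padding scheme here are assumptions of mine; any other program reading the file would need to know the original length and that the first element sits in the least significant bit of each byte.

```r
x <- c(TRUE, TRUE, FALSE, FALSE, TRUE)   # small stand-in for the large vector
n <- length(x)

# packBits() needs a length that is a multiple of 8, so pad with FALSE
padded <- c(x, rep(FALSE, (-n) %% 8))
bytes  <- packBits(padded, type = "raw")

path <- tempfile(fileext = ".bin")       # hypothetical file name
writeBin(bytes, path)

# Reading back: the reader must know n, since the padding is not recorded
raw_in <- readBin(path, what = "raw", n = file.size(path))
y <- as.logical(rawToBits(raw_in))[seq_len(n)]

all(x == y)
```

This only works for vectors without missing values, because packBits does not accept NA.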