I need to save feather data in a more compact format. After several tests I found that compressing the "feather" data file with the "zip" function reduces its size by up to 90%.

library(feather)
library(zip)

# Write data in feather format
write_feather(df, "data.feather")

# Write data in zip feather format
zip("data.feather.zip", "data.feather")

# Desired usage (hypothetical arguments): how can I get commands like these without using disk space for a temp file?
write_feather(df, "data.feather.zip", format = "zip")
df <- read_feather("data.feather.zip", format = "zip")

So the question is simple: is there any way to save data as a "feather" file with, let's say, an argument like "compressed = zip" to save disk space?

Thanks!

Andrii
  • 2
    Check out the library `arrow`. You can save as .parquet with compression. Similar performance as with feather, but with much smaller files. –  Jan 17 '20 at 20:15
  • 1
    If it doesn't need to be feather, you can use `fst` to read and write compressed data frames. e.g. write with 80% compression `fst::write_fst(df, 'mydat.fst', compress = 80)` – IceCreamToucan Jan 17 '20 at 20:15
  • 1
    This might also help, though I know it does not directly answer your question: https://stackoverflow.com/questions/1727772/quickly-reading-very-large-tables-as-dataframes/ –  Jan 17 '20 at 20:24
  • @IceCreamToucan Thanks for this great advice. I checked "fst" and it looks very cool. Just one question: is it possible to read "fst" files in Python, as it is for the feather format? Thanks! – Andrii Jan 17 '20 at 20:42
  • 1
    No, not possible (at least with premade tools) at this time – IceCreamToucan Jan 17 '20 at 20:47
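
As a follow-up to the `arrow` suggestion in the comments, here is a minimal sketch of writing a compressed feather file directly, with no separate zip step and no intermediate uncompressed file. It assumes the `arrow` package is installed and that its build includes the zstd codec (the code falls back to the package default otherwise). The example data frame and the file path are made up for illustration. Note that this uses `arrow::write_feather`, not the `write_feather` from the `feather` package used in the question, and the achievable size reduction will depend on the data.

```r
library(arrow)

# Hypothetical example data; any data frame would do
df <- data.frame(
  id    = 1:1000,
  value = rep(c(1.5, 2.5), 500),
  label = rep(c("a", "b", "c", "d"), 250),
  stringsAsFactors = FALSE
)

# Use zstd if this arrow build supports it, otherwise the package default
codec <- if (codec_is_available("zstd")) "zstd" else "default"

# Write a Feather V2 file with compression applied inside the file itself
path <- file.path(tempdir(), "data.feather")
write_feather(df, path, compression = codec)

# Read it back; decompression is handled transparently
df_back <- as.data.frame(read_feather(path))

# Size on disk in bytes, for comparison with an uncompressed write
file.size(path)
```

Files written this way should also be readable from Python through `pyarrow`, which keeps the cross-language use mentioned in the comment thread.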

0 Answers