Is there a way to figure out the uncompressed size of a Parquet file that was compressed with Snappy? I have a lot of Parquet files in an HDFS directory, and I'm trying to calculate how large the data would be if it were uncompressed.

lightweight

1 Answer


You can simply decompress the data and see how much space it takes up. See "How to decompress the hadoop reduce output file end with snappy?"

Maybe there's a more elegant way that I'm not aware of.
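One possibly more elegant route, offered as a sketch rather than as part of the original answer: the Parquet footer records a compressed and an uncompressed byte count for every column chunk, so the totals can be summed without decompressing anything. The example below assumes `pyarrow` is installed and that the file is reachable through a local path or a file-like object; the function names `sum_column_chunk_sizes` and `parquet_sizes` are made up for illustration. Note that the "uncompressed" figure is the size of the pages before Snappy is applied; the pages are still Parquet-encoded (dictionary, RLE and so on), so it is an estimate of the uncompressed file size, not the size of the raw values.

```python
def sum_column_chunk_sizes(metadata):
    """Return (compressed, uncompressed) byte totals from Parquet footer metadata.

    `metadata` is expected to behave like pyarrow.parquet.FileMetaData.
    """
    compressed = 0
    uncompressed = 0
    for rg_index in range(metadata.num_row_groups):
        row_group = metadata.row_group(rg_index)
        for col_index in range(row_group.num_columns):
            chunk = row_group.column(col_index)
            compressed += chunk.total_compressed_size
            uncompressed += chunk.total_uncompressed_size
    return compressed, uncompressed


def parquet_sizes(path):
    """Hypothetical helper: read only the footer of one file and sum its sizes."""
    # Imported lazily so the summing helper above works without pyarrow.
    import pyarrow.parquet as pq

    return sum_column_chunk_sizes(pq.ParquetFile(path).metadata)
```

Calling `parquet_sizes("some_file.parquet")` (a placeholder path) returns the two totals for that file; for a whole directory you would loop over the files and add the results up.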

Lior Chaga