
Our Hadoop cluster uses Snappy as the default codec. The reduce output files of a Hadoop job are named like part-r-00000.snappy. JSnappy fails to decompress such a file because JSnappy requires the file to start with SNZ, while the reduce output starts with some zero bytes instead.

How could I decompress the file?

DeepNightTwo
  • Similar to a question asked on the hadoop mailing lists - http://mail-archives.apache.org/mod_mbox/hadoop-mapreduce-user/201305.mbox/%3C1165688733-1369155084-cardhu_decombobulator_blackberry.rim.net-1208212455-@b4.c16.bise7.blackberry%3E – Chris White Nov 06 '13 at 11:55
  • `hadoop fs -text snappy_file` works. Thanks! – DeepNightTwo Nov 08 '13 at 03:45

1 Answer


Use `hadoop fs -text` to read the file (it detects and applies the codec for you) and redirect the output to a text file, e.g.:

hadoop fs -text part-r-00001.snappy > /tmp/mydatafile.txt
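If you need to read the file outside a Hadoop installation, it helps to know why JSnappy rejects it: Hadoop's Snappy output is not SNZ-framed. It uses Hadoop's own block container (written by `BlockCompressorStream`): a 4-byte big-endian uncompressed block length, followed by one or more pairs of [4-byte big-endian compressed chunk length, raw Snappy chunk]. For small blocks that leading big-endian length begins with zero bytes, which matches what the question observes at the start of the file. Below is a sketch of a decoder for that framing; to stay dependency-free it includes a toy raw-Snappy decoder that handles only literal elements, so it illustrates the container format rather than serving as a full decompressor (real data would need a proper Snappy library per chunk):

```python
import struct

def decode_raw_snappy_literals(buf):
    """Toy raw-Snappy decoder: handles only literal elements (no
    back-reference/copy elements). Enough to demonstrate the framing."""
    ulen, i, shift = 0, 0, 0
    while True:  # varint: uncompressed length of this chunk
        b = buf[i]; i += 1
        ulen |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    out = bytearray()
    while i < len(buf):
        tag = buf[i]; i += 1
        if tag & 0x03:
            raise NotImplementedError("copy elements need a real Snappy decoder")
        length = (tag >> 2) + 1
        if length > 60:  # long literal: length stored in following bytes
            nbytes = length - 60
            length = int.from_bytes(buf[i:i + nbytes], "little") + 1
            i += nbytes
        out += buf[i:i + length]
        i += length
    assert len(out) == ulen
    return bytes(out)

def decode_hadoop_snappy(data, decompress=decode_raw_snappy_literals):
    """Walk Hadoop's block framing: [BE32 uncompressed block length],
    then [BE32 compressed chunk length][raw Snappy chunk] pairs until the
    block's uncompressed bytes are accounted for; repeat until EOF."""
    out, i = bytearray(), 0
    while i < len(data):
        (ulen,) = struct.unpack(">I", data[i:i + 4]); i += 4
        produced = 0
        while produced < ulen:
            (clen,) = struct.unpack(">I", data[i:i + 4]); i += 4
            chunk = decompress(data[i:i + clen]); i += clen
            out += chunk
            produced += len(chunk)
    return bytes(out)

# Build a tiny Hadoop-Snappy frame by hand: "hello" as one literal chunk.
raw_chunk = b"\x05\x10hello"  # varint len=5, literal tag (4<<2), payload
framed = struct.pack(">I", 5) + struct.pack(">I", len(raw_chunk)) + raw_chunk
assert decode_hadoop_snappy(framed) == b"hello"
assert framed[0] == 0  # leading zero bytes, as seen in the question
```

In practice `hadoop fs -text` (or a Snappy binding that supports the Hadoop block format) remains the simplest route; the sketch above is mainly useful for understanding why SNZ-framed tools like JSnappy refuse the file.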

arviarya