Does Hadoop officially support streaming with binary formats as of 0.21?

hadoop-streaming.jar accepts an inputFormat argument that is a Java class name. How do you make that Java class available to the Hadoop streaming job? Besides running hadoop-streaming.jar with the inputFormat and outputFormat arguments, what else must be done to run a Python streaming job with SequenceFile input and output?

T. Webster

1 Answer

It looks like the documentation is muddy on this.

Try reading these answers:

How to use Hadoop Streaming with LZO-compressed Sequence Files?

Python read file as stream from HDFS

And don't worry about SequenceFile. It's not worth it.
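If you do want to try it anyway, here is a rough sketch of how the invocation might be assembled. It is not a tested recipe for 0.21. It assumes the stock classes `org.apache.hadoop.mapred.SequenceFileAsTextInputFormat` and `org.apache.hadoop.mapred.SequenceFileOutputFormat` are enough for your data, and that any custom format class is shipped with the generic `-libjars` option, which has to come before the streaming options. The helper name `build_streaming_command`, the jar name, and all paths are made up for illustration.

```python
import shlex


def build_streaming_command(streaming_jar, input_path, output_path,
                            mapper, reducer,
                            input_format="org.apache.hadoop.mapred.SequenceFileAsTextInputFormat",
                            output_format="org.apache.hadoop.mapred.SequenceFileOutputFormat",
                            lib_jars=()):
    """Build (but do not run) a hadoop streaming command as an argv list.

    Hypothetical helper: the defaults are assumptions, not a verified setup.
    """
    cmd = ["hadoop", "jar", streaming_jar]
    # Generic options such as -libjars must precede the streaming options.
    if lib_jars:
        cmd += ["-libjars", ",".join(lib_jars)]
    cmd += [
        "-input", input_path,
        "-output", output_path,
        "-inputformat", input_format,    # Java class name of the input format
        "-outputformat", output_format,  # Java class name of the output format
        "-mapper", mapper,
        "-reducer", reducer,
        # Ship the Python scripts to the task nodes.
        "-file", mapper,
        "-file", reducer,
    ]
    return cmd


if __name__ == "__main__":
    argv = build_streaming_command(
        "hadoop-streaming.jar",          # placeholder jar path
        "/data/in", "/data/out",         # placeholder HDFS paths
        "mapper.py", "reducer.py",
        lib_jars=["my-formats.jar"],     # placeholder jar holding a custom format
    )
    # Print a shell-quoted command line for inspection.
    print(" ".join(shlex.quote(a) for a in argv))
```

With `SequenceFileAsTextInputFormat`, keys and values reach the Python mapper as text on stdin, so the script itself does not need to parse the binary format.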

node