Does Hadoop officially support streaming with binary formats as of 0.21?
The hadoop-streaming.jar accepts an inputFormat
that is a Java class name. How do you provide the Hadoop streaming job this Java class? Besides executing hadoop-streaming.jar with inputFormat, outputFormat arguments, what else must be done to run a Python streaming job with SequenceFile input/output format?