If I have a command line program with input and output like this:
md5sum < hadoop-2.7.2.tar.gz
c442bd89b29cab9151b5987793b94041 -
How can I run it using Hadoop? This seems to be an embarrassingly simple problem, but none of the solutions I tried produced the correct output:
- Custom Binary Input - Hadoop
- Distributed Processing of Volumetric Image Data
- Hadoop Streaming Job with binary input?
Maybe I just wasn't able to follow the instructions correctly, so please explain in some detail, or at least point me to helpful documentation.
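To make the expected behaviour concrete, here is a minimal Python sketch of what the program does: it reads all of standard input as raw bytes and prints a single digest line. The names `md5_of_stream` and `main` are made up for illustration and are not part of Hadoop or of md5sum. It only mirrors the local command and says nothing about how Hadoop would deliver the bytes.

```python
import hashlib
import sys


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in fixed-size chunks."""
    digest = hashlib.md5()
    # iter() with a sentinel keeps calling read() until it returns b"" (end of stream).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def main():
    # sys.stdin.buffer yields raw bytes; text-mode stdin could alter binary input.
    # The trailing "-" mimics how md5sum labels input read from stdin.
    print(f"{md5_of_stream(sys.stdin.buffer)}  -")
```

Calling `main()` with the tarball redirected to stdin should print the same digest as the `md5sum` invocation above, provided every byte arrives unmodified and in order. Any solution that splits the input, treats it as text, or appends newlines would produce a different digest.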