
I have mapper and reducer executables written in C#. I want to use these with Hadoop streaming.

This is the command I'm using to create the Hadoop job...

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar 
-input "/user/hduser/ss_waits" 
-output "/user/hduser/ss_waits-output" 
–mapper "mono mapper.exe" 
–reducer "mono reducer.exe" 
-file "mapper.exe" 
-file "reducer.exe"

This is the error encountered by each mapper...

java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1014)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:592)
at org.apache.hadoop.mapred.lib.IdentityMapper.map(IdentityMapper.java:38)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

Based on the call stack, the (Java) IdentityMapper class appears to be running as the mapper. That would explain the type mismatch: IdentityMapper passes the LongWritable input key straight through, while the job expects a Text key from the map. The mapper should have been the executable "mono mapper.exe".

Any ideas why mono mapper.exe is not being used?

The mapper.exe and reducer.exe have the following permissions: -rwxr-xr-x

I am able to successfully execute mono mapper.exe from the unix command shell and have it read in text from stdin and write to stdout.
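For anyone wanting to reproduce that local check for both executables at once, here is a minimal sketch that mimics the map, sort, reduce flow of a streaming job with ordinary pipes. The function name simulate_streaming and the file sample.txt are made up for illustration, and the plain sort only approximates Hadoop's shuffle, so treat it as a sanity check rather than a faithful emulation.

```shell
# simulate_streaming MAPPER_CMD REDUCER_CMD
# Hypothetical helper: feeds stdin through the mapper command, sorts the
# mapper output by line (a rough stand-in for the shuffle phase), then
# feeds the sorted lines to the reducer command.
simulate_streaming() {
    mapper_cmd=$1
    reducer_cmd=$2
    sh -c "$mapper_cmd" | LC_ALL=C sort | sh -c "$reducer_cmd"
}

# Example usage (sample.txt is an assumed local input file):
#   simulate_streaming "mono mapper.exe" "mono reducer.exe" < sample.txt
```

If this produces the expected output locally, the executables themselves are probably fine and the problem is more likely in how the job is submitted.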

Environment:

  • Ubuntu Server 12.04 LTS (VM running on Azure)
  • Hadoop 1.0.4
  • Mono 2.10
user1793093
  • A trivial suggestion: If you're splitting your job submission command across multiple lines, are you writing \ at the end of each line (other than the last)? – Douglas Nov 16 '12 at 23:11
  • You can also try: create the wrapping script (http://www.mono-project.com/Guide:Running_Mono_Applications#Shell_Scripts) or create bundle (http://www.mono-project.com/Guide:Running_Mono_Applications#Bundles) – konrad.kruczynski Feb 13 '13 at 13:49
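A rough sketch of the wrapping-script idea from the comment above, assuming mono is on the PATH of the task nodes. The helper name write_mono_wrapper and the file names mapper.sh and reducer.sh are hypothetical; the generated script simply execs mono on the shipped executable and forwards any arguments.

```shell
# write_mono_wrapper WRAPPER EXE
# Hypothetical helper: writes a small executable shell script named WRAPPER
# that runs EXE under mono, passing stdin, stdout and arguments through.
write_mono_wrapper() {
    wrapper=$1
    exe=$2
    cat > "$wrapper" <<EOF
#!/bin/sh
# Run the Mono executable that is shipped alongside this script.
exec mono "$exe" "\$@"
EOF
    chmod +x "$wrapper"
}

# Example usage:
#   write_mono_wrapper mapper.sh mapper.exe
#   write_mono_wrapper reducer.sh reducer.exe
```

The job would then presumably name mapper.sh and reducer.sh in the mapper and reducer options, and ship both the scripts and the .exe files with the file option.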

1 Answer

Assuming mono is in the PATH, do you need the full path to mapper.exe and reducer.exe? i.e.

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-*.jar \
  -input "/user/hduser/ss_waits" \
  -output "/user/hduser/ss_waits-output" \
  -mapper "mono /path/to/mapper.exe" \
  -reducer "mono /path/to/reducer.exe" \
  -file "mapper.exe" \
  -file "reducer.exe"
joncham
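One more thing that may be worth ruling out, offered as an assumption rather than a confirmed diagnosis: in the command as posted in the question, the dash in front of mapper and reducer looks like an en dash (U+2013) rather than an ASCII hyphen, which can happen when a command passes through a word processor. If the same characters were typed in the shell, the streaming option parser would presumably not recognize those two options, which would be consistent with the default IdentityMapper being used. A small sketch for checking a pasted argument; has_non_ascii is a hypothetical helper name.

```shell
# has_non_ascii STRING
# Hypothetical helper: succeeds (exit status 0) if STRING contains any byte
# outside the printable ASCII range (space through tilde). Note that tabs
# and other control characters are flagged as well.
has_non_ascii() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# Example usage: paste the suspicious option between the quotes.
#   if has_non_ascii '-mapper'; then echo "non-ASCII character found"; fi
```

If the check flags an option, retyping the dash by hand in the terminal should be enough to test the theory.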