6

I have tried to follow the instructions on this page: http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/

$bin/hadoop jar contrib/streaming/hadoop-streaming-1.0.4.jar -input /user/root/wordcountpythontxt -output /user/root/wordcountpythontxt-output -mapper /user/root/wordcountpython/mapper.py -reducer /user/root/wordcountpython/reducer.py -file /user/root/mapper.py -file /user/root/reducer.py

It says

File: /user/root/mapper.py does not exist, or is not readable.
Streaming Command Fail

When i browsed through the url:jobdetails.jsp/

i found lot of exception

java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 9 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 14 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 17 more
Caused by: java.lang.RuntimeException: configuration exception
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
    at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
    ... 22 more
Caused by: java.io.IOException: Cannot run program "/user/root/wordcountpython/mapper.py": error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
    ... 23 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
    at java.lang.ProcessImpl.start(ProcessImpl.java:91)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
    ... 24 more

i am not able to fix it out pls help me to run the python pgm.

USB
  • 6,019
  • 15
  • 62
  • 93

3 Answers3

3

If you checked the instructions carefully on the link,

hduser@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-*streaming*.jar -file /home/hduser/mapper.py -mapper /home/hduser/mapper.py -file /home/hduser/reducer.py -reducer /home/hduser/reducer.py -input /user/hduser/gutenberg/* -output /user/hduser/gutenberg-output

there it clearly shows there is no need to copy the mapper.py and reducer.py to the HDFS, you can link both the files from the local filesystem: as /path/to/mapper. I am sure you can avoid the above error.

Anil Muppalla
  • 416
  • 4
  • 14
3

You might want to check that you don't have a dos-style new line after your #! line within mapper.py. If you do, hadoop may not be able to find your python interpreter since it'll see an extra CR. E.g. /usr/local/bin/python^M instead of /usr/local/bin/python where ^M is CR. Try dos2unix command on both your mapper and reducer.

John Pickard
  • 308
  • 1
  • 8
  • how will i do that to mapper and reducer – USB Mar 21 '13 at 04:00
  • ...sorry... just run `dos2unix mapper.py`, assuming you have dos2unix installed. And then the same for your reducer.py. You could grep for the dos new line to confirm it's there before and gone afterwards: [http://stackoverflow.com/questions/73833/how-do-you-search-for-files-containing-dos-line-endings-crlf-with-grep-under-l]. Also, if this is your problem then you might consider changing how your editor saves files. E.g. if you use Eclipse the default is for dos-style new lines – John Pickard Mar 22 '13 at 11:38
1

It seems problem is in line.

Caused by: java.io.IOException: Cannot run program "/user/root/wordcountpython/mapper.py": error=2, No such file or directory

Will you please check file /user/root/wordcountpython/mapper.py exists or not. If it exists then what is the permission to that file.

User by which you are running hadoop has the permission to execute and read this file?

Nilesh
  • 20,521
  • 16
  • 92
  • 148