I am trying to use Hadoop streaming with Python scripts, but I am getting the following error:

14/08/23 13:31:50 INFO streaming.StreamJob: To kill this job, run:
14/08/23 13:31:50 INFO streaming.StreamJob: UNDEF/bin/hadoop job  -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201408210627_0018
14/08/23 13:31:50 INFO streaming.StreamJob: Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201408210627_0018
14/08/23 13:31:51 INFO streaming.StreamJob:  map 0%  reduce 0%
14/08/23 13:32:17 INFO streaming.StreamJob:  map 100%  reduce 100%
14/08/23 13:32:17 INFO streaming.StreamJob: To kill this job, run:
14/08/23 13:32:17 INFO streaming.StreamJob: UNDEF/bin/hadoop job  -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201408210627_0018
14/08/23 13:32:17 INFO streaming.StreamJob: Tracking URL: http://0.0.0.0:50030/jobdetails.jsp?jobid=job_201408210627_0018
14/08/23 13:32:17 ERROR streaming.StreamJob: Job not successful. Error: NA
14/08/23 13:32:17 INFO streaming.StreamJob: killJob...
Streaming Command Failed!

This is the command I am running:

hadoop jar /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.7.0.jar -input "/user/cloudera/vecs" -output "/user/cloudera/vecs_output" -file /home/cloudera/vects/streaming/mapper.py -mapper mapper.py -file /home/cloudera/vects/streaming/reducer.py -reducer reducer.py -jobconf mapred.map.tasks=20 -jobconf mapred.reduce.tasks=1

When I look into the job setup, I can see:

java.lang.Throwable: Child Error
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:250)
Caused by: java.io.IOException: Task process exit with nonzero status of 1.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:237)

Everything seems to work when I run my code without Hadoop, using this command:

head -100 ./data/vecs.txt|./streaming/mapper.py|./streaming/reducer.py
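
One difference between this pipe and a real streaming job is that Hadoop sorts the mapper output by key before the reducer reads it. The following is a minimal sketch of that map, sort, reduce flow for local testing. The functions `map_line` and `reduce_group` are hypothetical stand-ins, since the contents of mapper.py and reducer.py are not shown here.

```python
from itertools import groupby
from operator import itemgetter


def map_line(line):
    # Hypothetical mapper: emit (token, 1) for each whitespace-separated token.
    for token in line.split():
        yield token, 1


def reduce_group(key, values):
    # Hypothetical reducer: sum all values seen for one key.
    return key, sum(values)


def simulate_streaming(lines):
    # Map phase: collect every (key, value) pair the mapper emits.
    pairs = [pair for line in lines for pair in map_line(line)]
    # Shuffle/sort phase: the reducer receives mapper output sorted by key.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: one call per run of identical keys.
    return [
        reduce_group(key, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    ]


if __name__ == "__main__":
    for key, total in simulate_streaming(["a b a", "b c"]):
        print("%s\t%d" % (key, total))
```

On the shell side, the equivalent check is to put `sort` between the mapper and the reducer in the pipe.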

I have also read through this post, and I have #! /usr/bin/env python2.7 as the first line of my Python files.
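
That shebang only works if `env` can find `python2.7` on the PATH of the account that runs the task, which may differ from the PATH of an interactive shell. A small sketch for checking the lookup on a node follows. The helper name `find_interpreter` is hypothetical, and the script needs Python 3.3 or newer because it relies on `shutil.which`.

```python
import shutil


def find_interpreter(name="python2.7"):
    # Return the absolute path that a PATH lookup resolves to,
    # or None when no executable with that name is found.
    return shutil.which(name)


if __name__ == "__main__":
    path = find_interpreter()
    if path is None:
        print("python2.7 not found on PATH")
    else:
        print("python2.7 resolves to %s" % path)
```

A result of None here would be one possible explanation for a task exiting with a nonzero status, though it is only one of several candidates.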

Does anyone have an idea what could be wrong? Thank you in advance for any suggestions and answers.

ziky90

1 Answer

I got everything working by setting up a new OS instance with Hadoop from scratch.

Now I'm just curious: what might have been the problem on my old Cloudera virtual machine?

ziky90