I'm trying to chain multiple jobs into a single job in Hadoop (I'm using API version 1.2.1). I came across an article on the topic, see here.

My main class is here: http://pastebin.com/C21PKM1j (I did some minor cleanup and rearranging to make it more readable). I'm using the Cloudera demo VM. Before I used chaining, my simple job worked well. This version finishes in under 10-20 seconds without any errors and without any useful information in the log file. I'm pretty sure no job is actually started, but I can't figure out why.

EDIT: the output directory is not created at all.

EDIT: For debugging, I included the jobRunner and handleRun snippets from here in my code. It runs for two iterations (I see "Still running" twice) and exits normally.

EDIT: I've been googling like a boss for hours. There seem to be many "working" examples, but problems arise with Hadoop versions and the correct API calls (many classes share the same name across hadoop-core.jar).

gyorgyabraham
  • Can you please post your code? – Rags Aug 27 '13 at 15:05
  • My code is in the post, see the pastebin link. – gyorgyabraham Aug 28 '13 at 11:20
  • I am not sure if the way you start the jobs is correct. I just checked the Hadoop reference, and it gives the following line to start a job: `JobClient.runJob(conf2);` Can you try this to be certain that it's not the thread construct causing the failure? – DDW Aug 28 '13 at 12:45

1 Answer


This answer may help you. With the API you are using, you have to change the map and reduce classes with `setMapperClass` and `setReducerClass` before each job and then submit the job. If you want to feed the output of one job to the next as input, use a string variable to build the output path dynamically. (If you don't need this part, you can drive the jobs from a script instead.)

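    // Class-level field: name of the output file written by a single reducer (old API).
    public static final String OUTPUT_FILE_NAME = "/part-00000";

    // Inside main(String[] args):
    String input = args[0];
    String out = args[1];
    String output = out + "job1";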

The following is for the old API (`org.apache.hadoop.mapred`):

    /* code for changing mapper and reducer classes */
    FileInputFormat.setInputPaths(conf, new Path(input));
    FileOutputFormat.setOutputPath(conf, new Path(output));
    JobClient.runJob(conf);  // blocks until the job completes

    // Feed the first job's output file to the second job.
    input = output + OUTPUT_FILE_NAME;
    output = out + "job2";
    ......
    ......
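The path bookkeeping for a longer chain can be sketched in plain Java, without any Hadoop dependency. This is only an illustration: `ChainPathPlanner` and `plan` are hypothetical names, not part of Hadoop, and the `/part-00000` file name assumes the old API with a single reducer per job. The sample paths in `main` are made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Hadoop): plans input/output paths for chained jobs. */
class ChainPathPlanner {
    // Assumption: old-API output file name when each job uses a single reducer.
    static final String OUTPUT_FILE_NAME = "/part-00000";

    /** Returns one {input, output} pair per job, in execution order. */
    static List<String[]> plan(String firstInput, String outPrefix, int jobCount) {
        List<String[]> stages = new ArrayList<>();
        String input = firstInput;
        for (int i = 1; i <= jobCount; i++) {
            String output = outPrefix + "job" + i;
            stages.add(new String[] {input, output});
            // The next job reads the file the current job is expected to write.
            input = output + OUTPUT_FILE_NAME;
        }
        return stages;
    }

    public static void main(String[] args) {
        for (String[] stage : plan("/data/in", "/data/out/", 3)) {
            // In a real driver, this is where the mapper and reducer classes
            // would be set on the JobConf, the paths applied, and
            // JobClient.runJob(conf) called before moving to the next stage.
            System.out.println(stage[0] + " -> " + stage[1]);
        }
    }
}
```

Each pair would be applied with `FileInputFormat.setInputPaths` and `FileOutputFormat.setOutputPath` before the corresponding `JobClient.runJob(conf)` call. Because `runJob` waits for the job to finish, the next stage starts only once its input exists.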