
I have a MapReduce job packaged as a jar file, say 'mapred.jar'. The JobTracker is running on a remote Linux machine. When I run the jar from my local machine, the job is submitted to the remote JobTracker and works fine, as below:

java -jar F:/hadoop/mapred.jar
     13/12/19 12:40:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
     13/12/19 12:40:27 INFO input.FileInputFormat: Total input paths to process : 49
     13/12/19 12:40:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
     13/12/19 12:40:27 WARN snappy.LoadSnappy: Snappy native library not loaded
     13/12/19 12:40:28 INFO mapred.JobClient: Running job: job_201312160716_0063
     13/12/19 12:40:29 INFO mapred.JobClient:  map 0% reduce 0%
     13/12/19 12:40:50 INFO mapred.JobClient:  map 48% reduce 0%
     13/12/19 12:40:53 INFO mapred.JobClient:  map 35% reduce 0%
     13/12/19 12:40:56 INFO mapred.JobClient:  map 29% reduce 0%
     13/12/19 12:41:02 INFO mapred.JobClient:  map 99% reduce 0%
     13/12/19 12:41:08 INFO mapred.JobClient:  map 100% reduce 0%
     13/12/19 12:41:23 INFO mapred.JobClient:  map 100% reduce 100%
     13/12/19 12:41:28 INFO mapred.JobClient: Job complete: job_201312160716_0063
      ...

But when I executed the same command through Java's ProcessBuilder, as below:

    ProcessBuilder pb = new ProcessBuilder("java", "-jar", "F:/hadoop/mapred.jar");
    pb.directory(new File("D:/test"));
    final Process process = pb.start();
    InputStream is = process.getInputStream();
    InputStreamReader isr = new InputStreamReader(is);
    BufferedReader br = new BufferedReader(isr);
    String line;
    while ((line = br.readLine()) != null) {
      System.out.println(line);
    }

    System.out.println("Waited for: "+ process.waitFor());
    System.out.println("Program terminated! ");

It also worked, and I can view the job status through, http://192.168.1.112:50030/jobtracker.jsp.

Problem

My problem is that the Java program does not terminate; it runs indefinitely even after the MapReduce job has completed! I also do not get any of the output messages that I got on the command line. How can I know when the job has finished?

Tom Sebastian

1 Answer


You should probably redirect stderr to stdout before starting to read:

pb.redirectErrorStream(true)

The reason is described in the documentation of the Process class:

... failure to promptly write the input stream or read the output stream of the subprocess may cause the subprocess to block, and even deadlock.

If you are using Java 7, where ProcessBuilder and Process are significantly improved, you could also just do

pb.inheritIO()

which will redirect the process's stderr and stdout to the ones of your Java process.
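Putting it together, here is a minimal self-contained sketch of the fix. It uses `java -version` (which writes to stderr) as a stand-in for the jar invocation, so the effect of `redirectErrorStream(true)` is visible; the class name `RunJar` is just an illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class RunJar {

    // Starts the command, drains its merged stdout/stderr, and returns the exit code.
    static int run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so the subprocess cannot block
        Process process = pb.start();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        }
        // The output stream has been fully drained, so waitFor() returns promptly.
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // "java -version" prints to stderr; without redirectErrorStream(true)
        // the read loop above would see no output at all.
        int exit = run("java", "-version");
        System.out.println("Exit code: " + exit);
    }
}
```

Because the log messages in the original question go to stderr, merging the streams is what makes both the output visible and `waitFor()` return.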

Update: By the way, you are better off submitting the Hadoop job using the Hadoop API (the Job and Configuration classes); see e.g. Calling a mapreduce job from a simple java program.
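As a rough illustration of that approach (a sketch only: the cluster addresses, input/output paths, and the `MyMapper`/`MyReducer` classes are hypothetical, and the configuration keys are the Hadoop 1.x-era ones matching the logs above):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SubmitJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the remote cluster (addresses are placeholders).
        conf.set("fs.default.name", "hdfs://192.168.1.112:9000");
        conf.set("mapred.job.tracker", "192.168.1.112:9001");

        Job job = new Job(conf, "my job");
        job.setJarByClass(SubmitJob.class);
        job.setMapperClass(MyMapper.class);      // hypothetical mapper class
        job.setReducerClass(MyReducer.class);    // hypothetical reducer class
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path("/input"));
        FileOutputFormat.setOutputPath(job, new Path("/output"));

        // Blocks until the job finishes and prints progress to the console,
        // so there is no ambiguity about when the job is done.
        boolean ok = job.waitForCompletion(true);
        System.exit(ok ? 0 : 1);
    }
}
```

With `waitForCompletion(true)` the driver itself reports map/reduce progress and returns the job's success status, which answers the "how can I know the job finished" part directly.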

Jakub Kotowski