
I am trying to integrate a Hadoop MapReduce job into a job-dispatching GUI client that I am developing as a personal project. I currently have two pieces: the client itself, and a correctly functioning MapReduce program (which I am already able to run on the Hadoop framework).

How can I execute my MapReduce program as a background process from the client I have created? Unfortunately, the answer mentioned here: [Calling a mapreduce job from a simple java program], like many others I reviewed online, appears to be deprecated and does not cover the most up-to-date procedure for Hadoop 3 (I am currently using Hadoop 3.1.1).

As some additional background: the primary purpose of the client I am developing is to let users dispatch jobs through a GUI, receive real-time feedback, and run statistical analysis on job progress.

The end result I am aiming for is for my program to execute the MapReduce job at the click of a button. Thank you so much for your time and consideration; I look forward to your replies :)
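
For concreteness, here is a minimal sketch of the direction I am considering, using the standard `org.apache.hadoop.mapreduce.Job` API; the configuration addresses are placeholders from my own setup, and the mapper/reducer classes would be passed in from my existing job:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobLauncher {

    // Called from the button's ActionListener; returns without waiting for the job.
    public static Job submitInBackground(Class<? extends Mapper> mapperClass,
                                         Class<? extends Reducer> reducerClass,
                                         String inputDir, String outputDir) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder addresses -- in my client these would come from a settings panel.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        conf.set("mapreduce.framework.name", "yarn");

        Job job = Job.getInstance(conf, "job-from-gui");
        job.setJarByClass(mapperClass);        // jar containing my MapReduce classes
        job.setMapperClass(mapperClass);
        job.setReducerClass(reducerClass);
        job.setOutputKeyClass(Text.class);     // key/value types from my existing job
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(inputDir));
        FileOutputFormat.setOutputPath(job, new Path(outputDir));

        job.submit();   // unlike waitForCompletion(), submit() returns immediately
        return job;     // the GUI can poll job.mapProgress() / job.isComplete()
    }
}
```

Since `submit()` is non-blocking, I imagine the real-time feedback could come from polling `job.mapProgress()` and `job.reduceProgress()` on a timer. Is this still the right approach in Hadoop 3?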

  • The way to submit jobs to YARN hasn't changed significantly from Hadoop 2 to 3; what is showing as deprecated? And why MapReduce rather than Spark with the Livy REST API? – OneCricketeer Jan 18 '19 at 05:18
  • Hi Cricket, I am actually a student (in my final year of high school) working on this as a personal project, so unfortunately I'm not too confident configuring or using Spark and Livy (to be completely honest, I haven't really experimented with either before). I have a simple MapReduce job written in Java which I would like to execute as a background process (on click) from a GUI I have written (also in Java). I am unsure whether the accepted answer on the thread linked above will work in my specific scenario. – Vilitaria Jan 21 '19 at 22:07
  • Unfortunately, the accepted answer was posted over seven years ago, and the comment section below it veered well off the original topic, so I would like clarification on whether that is still the most up-to-date way to execute a Hadoop job from a Java program. I hope that clarifies things a little. Thanks a ton for your time and consideration - I look forward to hearing back from you :) – Vilitaria Jan 21 '19 at 22:14
  • PS: I am aware of `Runtime.exec` and `ProcessBuilder` (a sketch of that approach follows below this thread), but I'm a little unsure whether spawning `hadoop jar` as a child process is the ideal way to call a MapReduce program from the GUI client I have written. – Vilitaria Jan 21 '19 at 23:46
  • Well, you need a JAR file, and then you need to execute this method in `org.apache.hadoop.util.RunJar` - https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/RunJar.java#L239 - which is the same thing that `hadoop jar` or `yarn jar` will do (see the second sketch below). – OneCricketeer Jan 22 '19 at 05:25
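
To make the `ProcessBuilder` route from the PS above concrete, here is a minimal sketch that shells out to `hadoop jar` and streams the job's console output back to the GUI. The jar path, main class, and arguments are placeholders, and it assumes the `hadoop` command is on the client machine's PATH:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExternalJobLauncher {

    // Spawns the equivalent of: hadoop jar myjob.jar com.example.MyJob <args...>
    public static Process launch(String jarPath, String mainClass, String... jobArgs)
            throws Exception {
        List<String> command = new ArrayList<>(Arrays.asList("hadoop", "jar", jarPath, mainClass));
        command.addAll(Arrays.asList(jobArgs));

        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);          // merge stderr into stdout
        Process process = pb.start();

        // Read output on a background thread so the GUI's event thread never blocks.
        new Thread(() -> {
            try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(process.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);  // or append to a text area in the GUI
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }).start();

        return process;  // process.waitFor() gives the job's exit code
    }
}
```

This keeps the job in a separate JVM, so a `System.exit()` inside the job's driver cannot take down the GUI; the trade-off is that progress has to be scraped from the console output rather than read from a `Job` object.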
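
And to illustrate OneCricketeer's suggestion, a sketch that invokes `RunJar` in-process, the same way the `hadoop jar` shell script does. Again, the jar path and class name here are placeholders:

```java
import org.apache.hadoop.util.RunJar;

public class InProcessJobLauncher {

    public static void launch(String jarPath, String mainClass, String... jobArgs)
            throws Throwable {
        // Same argument layout as the command line: jarFile [mainClass] args...
        String[] args = new String[jobArgs.length + 2];
        args[0] = jarPath;    // e.g. "/path/to/myjob.jar" (placeholder)
        args[1] = mainClass;  // e.g. "com.example.MyJob"  (placeholder)
        System.arraycopy(jobArgs, 0, args, 2, jobArgs.length);

        // RunJar unpacks the jar, builds a classloader, and reflectively calls
        // the job's main() method.
        RunJar.main(args);
    }
}
```

One caveat with the in-process route: many MapReduce drivers end with `System.exit(job.waitForCompletion(true) ? 0 : 1)`, which would terminate the GUI's JVM as well, so the child-process approach above may be safer for a desktop client.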
