
I'm looking for a way to add third-party jars to a MapReduce job. Currently we bundle the third-party jars into the job jar, but sometimes the job jar grows too large. Is there another approach to this problem?

Learn Hadoop

3 Answers


I believe "-libjars jar1,jar2,..." is what you need here.
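As a sketch (jar paths and the driver class name below are illustrative; note that -libjars is handled by GenericOptionsParser, so the driver class must implement the Tool interface and be launched via ToolRunner for the option to take effect):

```shell
# Pass third-party jars at submit time instead of bundling them
# into the job jar. Paths and class names are illustrative.
export LIBJARS=/path/to/jar1.jar,/path/to/jar2.jar

# Hadoop ships the listed jars to the cluster via the distributed
# cache and adds them to the task classpath.
hadoop jar myjob.jar com.example.MyDriver -libjars ${LIBJARS} /input /output
```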

alex-arkhipov

Generally, going down the uber jar route is a good one; adding jars to the generic Java classpath becomes problematic when different MapReduce jobs depend on different versions of the same jar.

shaine
  • "Generally going down the uber jar route is a good one" - what does that mean? Can you please elaborate? – Learn Hadoop Apr 30 '18 at 12:33
    You pack all your dependency jars into the jar you build to run your application. (This is a much better explanation than I could give: https://stackoverflow.com/questions/11947037/what-is-an-uber-jar ) – shaine Apr 30 '18 at 13:52
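As a minimal sketch of the uber jar workflow (assuming a Maven project with the maven-shade-plugin configured in pom.xml; the artifact and class names are illustrative):

```shell
# Build a single "uber" jar that contains the application classes
# plus all dependency jars unpacked into it.
mvn clean package

# Submit the self-contained jar; no extra classpath setup is needed
# because every dependency is already inside it.
hadoop jar target/myjob-with-dependencies.jar com.example.MyDriver /input /output
```

The trade-off, as the question notes, is jar size: every job ships its full dependency set, but in exchange each job is isolated from the others' dependency versions.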

Use the command below.

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/path/to/my/jar1:/path/to/my/jar2

Then you can run your Hadoop jobs as usual: hadoop jar <your-jar> [mainClass]. For more details check this out.

dbustosp