
I'm looking for a way to add third-party jars to a MapReduce job. Currently we bundle the third-party jars into the job jar, but sometimes the job jar grows too large. Is there another approach to this problem?

Learn Hadoop

3 Answers


I believe "-libjars jar1,jar2,..." is what you need here.
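As a sketch (jar paths and the driver class name below are illustrative; note that -libjars is handled by GenericOptionsParser, so the driver class must implement the Tool interface and be launched via ToolRunner for the option to take effect):

```shell
# Pass third-party jars at submit time instead of bundling them
# into the job jar. Paths and class names are illustrative.
export LIBJARS=/path/to/jar1.jar,/path/to/jar2.jar

# Hadoop ships the listed jars to the cluster via the distributed
# cache and adds them to the task classpath.
hadoop jar myjob.jar com.example.MyDriver -libjars ${LIBJARS} /input /output
```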

alex-arkhipov

Generally, going down the uber jar route is a good one; adding jars to the generic Java classpath becomes problematic when different MapReduce jobs depend on different versions of the same jar.

shaine
  • "Generally going down the uber jar route is a good one" - what does that mean? Can you please elaborate? – Learn Hadoop Apr 30 '18 at 12:33
    You pack all your dependency jars into the jar you build to run your application. (This is a much better explanation than I could give: https://stackoverflow.com/questions/11947037/what-is-an-uber-jar ) – shaine Apr 30 '18 at 13:52
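As a minimal sketch of the uber jar workflow (assuming a Maven project with the maven-shade-plugin configured in pom.xml; the artifact and class names are illustrative):

```shell
# Build a single "uber" jar that contains the application classes
# plus all dependency jars unpacked into it.
mvn clean package

# Submit the self-contained jar; no extra classpath setup is needed
# because every dependency is already inside it.
hadoop jar target/myjob-with-dependencies.jar com.example.MyDriver /input /output
```

The trade-off, as the question notes, is jar size: every job ships its full dependency set, but in exchange each job is isolated from the others' dependency versions.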

Use the command below.

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/path/to/my/jar1:/path/to/my/jar2

Then you can run your Hadoop jobs as usual: hadoop jar <your-jar> [mainClass]. For more details check this out.

dbustosp