TL;DR: How can I update the jar of a custom UDF in Hive?
I wrote my own generic UDF and it works well. I can define a new function and use it with the command:
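(The original command block seems to have been lost here; a typical definition looks like the following sketch, where the jar path, function name, and class name are hypothetical placeholders.)

```sql
-- Register the jar and create a permanent function from it
-- (all names below are examples, not the actual ones used):
CREATE FUNCTION myfunc
  AS 'com.example.hive.udf.MyFunc'
  USING JAR 'hdfs:///hive-udf-wp/hive-udf-wp.jar';

SELECT myfunc(some_column) FROM some_table;
```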
Now I want to update my UDF, so I put an updated version of the jar, with the same name, in HDFS. Afterwards, what happens is:
- the first call to the function gives:
java.io.IOException: Previous writer likely failed to write hdfs://ip-10-0-10-xxx.eu-west-1.compute.internal:8020/tmp/hive/hive/_tez_session_dir/0de6055d-190d-41ee-9acb-c6b402969940/myfunc.jar Failing because I am unlikely to write too.
- the second call gives:
org.apache.hadoop.hive.ql.metadata.HiveException: Default queue should always be returned.Hence we should not be here.
The log file shows:
Localizing resource because it does not exist: file:/tmp/8f45f1b7-2850-4fdc-b07e-0b53b3ddf5de_resources/myfunc.jar to dest: hdfs://ip-10-0-10-129.eu-west-1.compute.internal:8020/tmp/hive/hive/_tez_session_dir/994ad52c-4b38-4ee2-92e9-67076afbbf10/myfunc.jar
tez.DagUtils (DagUtils.java:localizeResource(961)) - Looks like another thread is writing the same file will wait.
tez.DagUtils (DagUtils.java:localizeResource(968)) - Number of wait attempts: 5. Wait interval: 5000
tez.DagUtils (DagUtils.java:localizeResource(984)) - Could not find the jar that was being uploaded
What I already tried:
- adding the jar to hive.reloadable.aux.jars.path and hive.aux.jars.path
- different combinations of LIST JARS / DELETE JAR / CREATE FUNCTION / RELOAD, to no avail.
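For reference, one of the sequences I tried looked roughly like this (a sketch with hypothetical jar path, function name, and class name):

```sql
-- Remove the old resource and function, then re-register the new jar:
DELETE JAR hdfs:///hive-udf-wp/hive-udf-wp.jar;
DROP FUNCTION IF EXISTS myfunc;
RELOAD;  -- re-scans hive.reloadable.aux.jars.path for changed jars
CREATE FUNCTION myfunc
  AS 'com.example.hive.udf.MyFunc'
  USING JAR 'hdfs:///hive-udf-wp/hive-udf-wp.jar';
```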
I even ended up with a query that apparently starts fine but then just hangs, not moving forward, with nothing in the logs and no DAG created:
INFO : converting to local hdfs:///hive-udf-wp/hive-udf-wp.jar
INFO : Added [/tmp/19e0c9fc-9c7c-4de5-a034-ced062f87f64_resources/hive-udf-wp.jar] to class path
INFO : Added resources: [hdfs:///hive-udf-wp/hive-udf-wp.jar]
I would think that asking Tez not to reuse the current session could do the trick, since new sessions would then be created without an old version of the jar. Would that be an option?
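As a pragmatic workaround along those lines, forcing a fresh client session (e.g. closing and reopening the connection, or !reconnect in Beeline) should give a new _tez_session_dir so the updated jar no longer clashes with the stale copy; the sketch below assumes the same hypothetical names as above:

```sql
-- After reconnecting (new Tez session, new _tez_session_dir),
-- re-register the updated jar in the fresh session:
ADD JAR hdfs:///hive-udf-wp/hive-udf-wp.jar;
RELOAD;
```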