
I have created a custom UDF in Hive; it has been tested from the Hive command line and works fine. Now that I have the JAR file for the UDF, what do I need to do so that users can create a temporary function pointing to it? Ideally, from the Hive command prompt, I would do this:

hive> add jar myudf.jar;
Added [myudf.jar] to class path
Added resources: [myudf.jar]
hive> create temporary function foo as 'mypackage.CustomUDF';

After this I am able to use the function properly.

But I don't want to add the JAR every time I want to execute the function. I should be able to run this function while:

  1. executing a Hive query against an HDInsight cluster from Visual Studio
  2. executing a Hive query from the command line through SSH (Linux) or RDP/cmd (Windows)
  3. executing a Hive query from the Ambari Hive view (Linux)
  4. executing a Hive query from the HDInsight Query Console Hive Editor (Windows cluster)

So, no matter how I execute the query, the JAR should already be available and added to the path. What is the process to ensure this for both Linux and Windows clusters?

DennisLi
Dhiraj
  • Check these: https://issues.apache.org/jira/browse/HIVE-6047 https://issues.apache.org/jira/secure/attachment/12626615/PermanentFunctionsinHive.pdf – Munesh Jul 22 '16 at 01:34
  • This is not what I meant. I don't mind re-registering the function using the add jar command. The question was about adding the JAR to the path, not about permanent UDFs. I wanted to understand the steps (where to copy the JAR, etc.) so that it will be available through Hive irrespective of how Hive is being accessed for that cluster. As of now, I can connect to the cluster (headnode) using SSH, copy the JAR to my home directory on the headnode, and issue the add jar command. But what if I am using Hive through the web UI (HDInsight Hive Editor), the Ambari Hive View, or Visual Studio to issue the command? – Dhiraj Jul 22 '16 at 01:41
  • To clarify further, I connected to the headnode of an HDInsight Hadoop (Windows) cluster using RDP and copied the JAR file to one of the folders that was already on the system path (listed in the Windows PATH variable). Still, when I issued the add jar command from the Hive prompt, it said it could not find the JAR file. This is what I want to avoid. It looks like Hive has its own path variable. – Dhiraj Jul 22 '16 at 02:06

1 Answer


Maybe you could add the JAR in the .hiverc file in Hive's conf directory. This file is loaded every time Hive starts, so from then on you will not need to add the JAR separately for each session.
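
As a rough illustration of that suggestion, a .hiverc file could contain the same two statements from the question. This is only a sketch: the JAR location `/path/to/myudf.jar` is a hypothetical placeholder, and the function and class names are taken from the question. As far as I know, .hiverc is read by the Hive command-line client at startup, so it may not cover queries submitted through other front ends such as the Ambari Hive view or Visual Studio.

```sql
-- Sketch of a .hiverc file; each statement runs when the Hive CLI starts.
-- ASSUMPTION: /path/to/myudf.jar is a placeholder, replace it with the
-- real location of the JAR on the node where Hive is launched.
ADD JAR /path/to/myudf.jar;

-- Register the UDF under the name used in the question.
CREATE TEMPORARY FUNCTION foo AS 'mypackage.CustomUDF';
```

With a file like this in place, a new Hive CLI session should be able to call foo without issuing add jar manually.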