I have created a custom UDF in Hive; it's tested on the Hive command line and works fine. Now that I have the jar file for the UDF, what do I need to do so that users can create a temporary function pointing to it? Ideally, from the Hive prompt I would do this:-
hive> add jar myudf.jar;
Added [myudf.jar] to class path
Added resources: [myudf.jar]
hive> create temporary function foo as 'mypackage.CustomUDF';
After this I am able to use the function properly.
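For context, the UDF class behind 'mypackage.CustomUDF' has the usual Hive UDF shape. The sketch below is illustrative only: the real class extends org.apache.hadoop.hive.ql.exec.UDF (from the hive-exec dependency, commented out here so the snippet compiles standalone), and the upper-casing logic is a hypothetical stand-in for my actual function.

```java
// Illustrative sketch of a Hive UDF; the real class would declare
// "extends org.apache.hadoop.hive.ql.exec.UDF" (hive-exec dependency).
public class CustomUDF /* extends UDF */ {
    // Hive invokes evaluate() once per row. The upper-casing here is a
    // hypothetical stand-in for the real UDF's logic.
    public String evaluate(String input) {
        if (input == null) {
            return null; // Hive UDFs must tolerate NULL input
        }
        return input.toUpperCase();
    }

    public static void main(String[] args) {
        CustomUDF udf = new CustomUDF();
        System.out.println(udf.evaluate("hello")); // prints HELLO
    }
}
```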
But I don't want to add the jar each and every time I want to execute the function. I should be able to run this function while:-
- executing Hive query against HDInsight cluster from Visual Studio
- executing Hive query from the command line through SSH (Linux) or RDP/cmd (Windows)
- executing Hive query from Ambari (Linux) Hive view
- executing Hive query from the HDInsight Query Console Hive Editor (Windows cluster)
So, no matter how I execute the query, the JAR should already be available and on the classpath. What's the process to ensure this for both Linux and Windows clusters?
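To be concrete, what I'm after is the effect of every session implicitly running these two lines from my example above (the jar location is whatever a cluster-wide deployment would use; I don't know the recommended HDInsight mechanism, whether a .hiverc-style file, a storage-account path, or something else):

hive> add jar myudf.jar;
hive> create temporary function foo as 'mypackage.CustomUDF';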