I am writing a UDF for Hive in Java. I need to read Excel files stored in HDFS and process them in the UDF. I am using the Apache POI library for the processing.

When we read from HDFS we get an FSDataInputStream, but creating a Workbook with POI requires an InputStream object.
The code does not give any error at compile time, though.

FSDataInputStream stream = hdfs.open(new Path(inputFile));
Workbook workbook = new XSSFWorkbook(stream);

But when I create the temporary function, I get:

Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask. org/apache/poi/ss/usermodel/Workbook

OneCricketeer
Sachit Murarka

1 Answer

It looks like you didn't add all the dependent jars. You need to either build your jar with its dependencies included or add each dependent jar one by one.

Command: hive> ADD JAR myjar.jar;
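
The path in the error message (org/apache/poi/ss/usermodel/Workbook) looks like the name of a class that could not be loaded. As a rough way to confirm that a given jar really carries the POI classes, here is a minimal sketch of a classpath check. The class name ClasspathCheck and the method isLoadable are hypothetical helpers made up for illustration, not part of Hive or POI, and it assumes you run it with the same jar(s) on the classpath that you add to Hive.

```java
public class ClasspathCheck {

    // Hypothetical helper: returns true if the named class can be located
    // by the class loader that loaded ClasspathCheck.
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or something it links against) is not available on the classpath
            return false;
        }
    }

    public static void main(String[] args) {
        // The two POI types used in the question's snippet
        String[] names = {
            "org.apache.poi.ss.usermodel.Workbook",
            "org.apache.poi.xssf.usermodel.XSSFWorkbook"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

If you run this with only your UDF jar on the classpath (for example java -cp myjar.jar ClasspathCheck, assuming the class is compiled into that jar) and it reports MISSING, the jar was most likely built without its dependencies.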

Luk
  • In the Maven dependencies (pom file), I have specified all the POI dependencies. – Sachit Murarka Feb 08 '18 at 13:38
  • Yes, but by default, when you use Maven to create a jar, it does not include the dependent jars, only the classes you have created. Look at this question: https://stackoverflow.com/questions/574594/how-can-i-create-an-executable-jar-with-dependencies-using-maven – Luk Feb 08 '18 at 13:43
  • Yes, as the link I provided above says, you need to build the jar with its dependencies. Then add that jar to Hive as you did before, and the error should no longer appear. – Luk Feb 08 '18 at 13:56
  • After which tag do I need to put this set of properties? It is giving an error in the pom.xml. – Sachit Murarka Feb 08 '18 at 14:09
  • Can you please tell me, @Luk? – Sachit Murarka Feb 09 '18 at 05:09
  • I have removed the main class, since a Hive UDF doesn't have one. Now it's working. Thanks @Luk – Sachit Murarka Feb 09 '18 at 13:35