
I created a project 'spark-udf' and wrote a Hive UDF in it, as below:

package com.spark.udf
import org.apache.hadoop.hive.ql.exec.UDF

class UpperCase extends UDF with Serializable {
  def evaluate(input: String): String = {
    input.toUpperCase
  }
}

I built it, created a jar for it, and tried to use this UDF in another Spark program:

spark.sql("CREATE OR REPLACE FUNCTION uppercase AS 'com.spark.udf.UpperCase' USING JAR '/home/swapnil/spark-udf/target/spark-udf-1.0.jar'")

But the following line gives me an exception:

spark.sql("select uppercase(Car) as NAME from cars").show

Exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: No handler for UDAF 'com.spark.udf.UpperCase'. Use sparkSession.udf.register(...) instead.; line 1 pos 7
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.makeFunctionExpression(SessionCatalog.scala:1105)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog$$anonfun$org$apache$spark$sql$catalyst$catalog$SessionCatalog$$makeFunctionBuilder$1.apply(SessionCatalog.scala:1085)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog$$anonfun$org$apache$spark$sql$catalyst$catalog$SessionCatalog$$makeFunctionBuilder$1.apply(SessionCatalog.scala:1085)
  at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:115)
  at org.apache.spark.sql.catalyst.catalog.SessionCatalog.lookupFunction(SessionCatalog.scala:1247)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$16$$anonfun$applyOrElse$6$$anonfun$applyOrElse$52.apply(Analyzer.scala:1226)
  at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$16$$anonfun$applyOrElse$6$$anonfun$applyOrElse$52.apply(Analyzer.scala:1226)
  at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48)

Any help around this is really appreciated.

Swapnil Chougule
    why do you want to write a hive-UDF to use in Spark? Better define a spark UDF if you want to use it in spark – Raphael Roth Sep 04 '18 at 11:35
  • I have also tried a Spark UDF but got the same exception: `import org.apache.spark.sql.api.java.UDF1; class UpperCase extends UDF1[String, String] with Serializable { override def call(t1: String): String = t1.toUpperCase }` – Swapnil Chougule Sep 04 '18 at 11:38
  • How are you adding the UDF jar to your Spark program ? – philantrovert Sep 04 '18 at 12:53
  • @philantrovert I am adding the jar in the same "CREATE OR REPLACE FUNCTION" command: `CREATE OR REPLACE FUNCTION uppercase AS 'com.spark.udf.UpperCase' USING JAR '/home/swapnil/spark-udf/target/spark-udf-1.0.jar'` – Swapnil Chougule Sep 05 '18 at 12:15

2 Answers


As mentioned in the comments, it's better to write a Spark UDF:

val uppercaseUDF = spark.udf.register("uppercase", (s : String) => s.toUpperCase)
spark.sql("select uppercase(Car) as NAME from cars").show
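
One caveat worth noting (my addition, not part of the original answer): the lambda above throws a NullPointerException when the column contains nulls, because Spark passes null through for reference types such as String. A null-safe variant of the same registration could look like:

```scala
// Null-safe version of the uppercase UDF: Spark hands the function a
// null String for null rows, so guard before calling toUpperCase.
val uppercaseUDF = spark.udf.register(
  "uppercase",
  (s: String) => if (s == null) null else s.toUpperCase
)
```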

The main cause is that you didn't call enableHiveSupport when creating the SparkSession. In that situation the default SessionCatalog is used, and the makeFunctionExpression function in SessionCatalog scans only for user-defined aggregate functions. If the function is not a UDAF, it won't be found.
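For reference, a minimal sketch of creating a session with Hive support enabled (the app name and master below are illustrative placeholders, not from the question):

```scala
import org.apache.spark.sql.SparkSession

// enableHiveSupport switches the catalog implementation to "hive",
// whose function resolution can handle plain Hive UDFs as well,
// not only user-defined aggregate functions.
val spark = SparkSession.builder()
  .appName("spark-udf-demo")   // placeholder app name
  .master("local[*]")          // placeholder, for a local run
  .enableHiveSupport()
  .getOrCreate()
```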

I created a Jira task to implement this.

T. Gawęda
  • @SwapnilChougule Changed slightly my answer, please check it - should work :) – T. Gawęda Sep 04 '18 at 17:05
  • 1
    @Gaweda I had enabled Hive support also. With hive support, hive udf works. I want to use scala UDF through external jar. Even I observed makeFunctionExpression only supports UDAF but not UDF. This is what I was seeking. You have opened JIRA for this :) – Swapnil Chougule Sep 05 '18 at 12:14
  • Thanks, enabling Hive was what I was lacking. – user2015762 Feb 17 '22 at 23:12
  • Hello Sir @Gaweda, I'd like to use `spark.udf.register()` to add UDF from JAR files. Could you please give me some pointers/references? – Smile Apr 12 '22 at 21:00

The issue is that the class needs to be public. Note that Scala has no `public` keyword; a class with no access modifier is already public, so make sure there is no `private` or `private[...]` modifier on it:

package com.spark.udf
import org.apache.hadoop.hive.ql.exec.UDF

class UpperCase extends UDF with Serializable {
  def evaluate(input: String): String = {
    input.toUpperCase
  }
}
Ranga Reddy
  • The issue was with Spark 2.3.0; please test with that version. There was no handler, as mentioned above. I don't know about recent versions. You can also check the corresponding Jira. – Swapnil Chougule Apr 08 '21 at 07:42