I have created a SQL UDF that takes one string parameter and returns a string.
Reference doc: https://docs.databricks.com/sql/language-manual/sql-ref-syntax-ddl-create-sql-function.html#parameters
I am trying to call this function inside the lambda passed to Spark SQL's transform function (the higher-order function for arrays).
SQL UDF DDL:
CREATE OR REPLACE FUNCTION printStr(str STRING)
RETURNS STRING
COMMENT 'print given string'
LANGUAGE SQL
RETURN str
Temp view to test the SQL UDF:
CREATE OR REPLACE TEMP VIEW values AS
select * from values (array("1","2","3")),(array("2","3","5")),(array("3","44"))
Calling the SQL UDF inside the lambda of Spark SQL's transform function:
select transform(col1, val -> printStr(val)) from VALUES
Exception (Spark version 3.2.1, Scala 2.12, Databricks Runtime 10.4 LTS):
Error in SQL statement: AnalysisException: Resolved attribute(s) val#1298163 missing from in operator !Project [cast(lambda val#1298163 as string) AS str#1298165].; line 1 pos 30
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Resolved attribute(s) val#1298163 missing from in operator !Project [cast(lambda val#1298163 as string) AS str#1298165].; line 1 pos 30
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis(CheckAnalysis.scala:60)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.failAnalysis$(CheckAnalysis.scala:59)
at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:225)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$2(CheckAnalysis.scala:533)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$2$adapted(CheckAnalysis.scala:105)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:358)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1(TreeNode.scala:357)
at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreachUp$1$adapted(TreeNode.scala:357)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:357)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:105)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:100)
My guess is that the NamedLambdaVariable (the lambda parameter val) is not being resolved when it is passed as an argument to a SQL UDF.