
Title says it all:

Is there an equivalent to Spark SQL's LATERAL VIEW clause in the Spark DataFrame API? I want to generate a column from a UDF that returns a struct containing multiple columns' worth of data, and then laterally spread the struct's fields into the parent DataFrame as individual columns.

Something equivalent to df.select(expr("LATERAL VIEW udf(col1,col2...coln)"))

thebluephantom
Rimer

1 Answer


I solved this by selecting the UDF result into a named column:

val dfWithUdfResolved = dataFrame.select(calledUdf().as("tuple_column"))

... then ...

dfWithUdfResolved
  .withColumn("newCol1", $"tuple_column._1")
  .withColumn("newCol2", $"tuple_column._2")
  // ...
  .withColumn("newColn", $"tuple_column._n")

Basically, I use tuple accessor notation (`_1`, `_2`, ..., `_n`) to pull the values out of the struct column into new discrete columns.
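To make the approach above concrete, here is a minimal, self-contained sketch. The UDF, column names, and data (`myUdf`, `colA`, `colB`, the sample rows) are illustrative placeholders, not from the original post; a UDF that returns a Scala tuple yields a struct column whose fields are named `_1`, `_2`, and so on:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object StructUdfSpread {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("struct-udf-spread")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input data.
    val df = Seq(("a", 1), ("b", 2)).toDF("colA", "colB")

    // A UDF returning a tuple produces a struct column with fields _1, _2, ...
    val myUdf = udf((s: String, i: Int) => (s.toUpperCase, i * 10))

    // Select the UDF result into a single struct column.
    val withStruct = df.select(myUdf($"colA", $"colB").as("tuple_column"))

    // Spread the struct's fields into discrete columns, then drop the struct.
    val spread = withStruct
      .withColumn("newCol1", $"tuple_column._1")
      .withColumn("newCol2", $"tuple_column._2")
      .drop("tuple_column")

    spread.show()
    spark.stop()
  }
}
```

As a shorter alternative, `withStruct.select($"tuple_column.*")` expands all of the struct's fields into top-level columns in one step, which avoids writing one `withColumn` per field.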

Rimer