I'm trying to implement the "moving median" function as a window function to use it in Apache Spark SQL.
I'm trying to implement it as a UDAF in Scala. The version of Spark is 1.6.1.
I tried 2 ways of calling my UDAF ("median"):
1) As an SQL query:
val timeSeries = ... // get a DataFrame
...
timeSeries.registerTempTable("time_series")
timeSeries.sqlContext.udf.register("median", new MedianUDAF)
val timeSeriesWithMovingAverage = timeSeries.sqlContext.sql(s"select *, median(value_column) over (partition by metrics_name order by time_column) from time_series")
The result was:
Failure(org.apache.spark.sql.AnalysisException: Couldn't find window function median;)
2) As a DataFrame API call:
val timeSeriesWithMovingAverage = timeSeries.withColumn("movingAvg", medianFunction(timeSeries("value_column")).over(windowSpec))
The result was:
Failure(java.lang.UnsupportedOperationException: MedianUDAF(value#16) is not supported in a window operation.)
Is there any way to use UDAFs as window functions? For example, to calculate moving median (not moving average but median).