
As of Spark 1.5.0, it seems possible to write your own UDAFs for custom aggregations on DataFrames: Spark 1.5 DataFrame API Highlights: Date/Time/String Handling, Time Intervals, and UDAFs

It is unclear to me, however, whether this functionality is supported in the Python API.

kentt
    No, it is not supported. You can call Scala UDAF but it is not pretty. See [my answer](http://stackoverflow.com/a/33257733/1560062) to [Spark: How to map Python with Scala or Java User Defined Functions?](http://stackoverflow.com/q/33233737/1560062) for a full example. – zero323 Nov 03 '15 at 15:17
  • @zero323 So is it now available in Spark 1.6 or 1.6.1? – stackit May 18 '16 at 11:13
  • @stackit Neither 1.6.x nor 2.0. – zero323 May 31 '16 at 14:01
  • Possible duplicate of [Spark: How to map Python with Scala or Java User Defined Functions?](https://stackoverflow.com/questions/33233737/spark-how-to-map-python-with-scala-or-java-user-defined-functions) – eje Jul 07 '17 at 18:10

1 Answer


You cannot define a Python UDAF in Spark 1.5.0-2.0.0. There is a JIRA tracking this feature request:

It was resolved with goal "later", so it probably won't happen anytime soon.

You can use a Scala UDAF from PySpark, though - this is described in Spark: How to map Python with Scala or Java User Defined Functions?
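The wiring for that workaround looks roughly like the sketch below. It assumes a hypothetical Scala `UserDefinedAggregateFunction` class `com.example.MyUDAF`, compiled into a jar that you ship with `--jars`, and an existing `sqlContext`; all of those names are placeholders, not anything provided by Spark itself, so adapt them to your own build.

```python
# Sketch: invoking a Scala UDAF from PySpark via the py4j JVM gateway.
# com.example.MyUDAF is a hypothetical Scala class you must compile and
# ship yourself (e.g. with --jars); sqlContext is assumed to exist.
from pyspark.sql.column import Column, _to_java_column, _to_seq

def my_udaf(col):
    sc = sqlContext._sc
    # Instantiate the Scala UserDefinedAggregateFunction on the JVM side.
    jvm_udaf = sc._jvm.com.example.MyUDAF()
    # Apply it to the Java-converted input column and wrap the result
    # back into a Python Column so it can be used in agg()/select().
    return Column(jvm_udaf.apply(_to_seq(sc, [col], _to_java_column)))

# Usage (hypothetical): df.groupBy("key").agg(my_udaf(df["value"]))
```

This relies on private helpers (`_to_java_column`, `_to_seq`, `_jvm`), so it can break between Spark versions; see the linked answer for a complete, version-specific example.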

Community