
I'm working with Cassandra + Spark + Spark SQL, and I'm not using Hive. I'd like to create a custom aggregation function, like:

select percentile(column, 0.95) from cassandra_table

Spark SQL supports avg(), min(), etc. I want to implement others, such as percentile, but I cannot find any documentation on how to do this.

Can someone point me to any documentation or class to start with?

Thanks!

RJtokenring
  • This could probably help: http://stackoverflow.com/a/32101530/935083 – rchukh Feb 11 '16 at 15:59
  • @rchukh Not so much. It is not possible to efficiently implement percentile function with Spark UDAF. I would simply use Hive UDF: http://stackoverflow.com/a/34521857/1560062 (and just to clarify Hive UDFs are not Hive specific). – zero323 Feb 11 '16 at 16:14
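
To make the efficiency concern in the comment above concrete, here is a minimal sketch in plain Python (not Spark code). It assumes an exact percentile with linear interpolation between the two closest ranks, and it mimics the update/merge/evaluate shape of an aggregation function. The class `PercentileBuffer` and the helper `percentile` are hypothetical names used only for illustration. One plausible reading of the comment, which is my assumption and not something it spells out, is that an exact percentile has to keep every value in the buffer, unlike avg() or min(), which only need a constant-size running state.

```python
import math


class PercentileBuffer:
    """Hypothetical aggregation buffer shaped like a UDAF: update, merge, evaluate."""

    def __init__(self):
        # Unlike avg() or min(), the buffer must retain every value seen.
        self.values = []

    def update(self, value):
        # Called once per input row; nulls are skipped.
        if value is not None:
            self.values.append(value)

    def merge(self, other):
        # Combining two partial buffers copies all of the other buffer's values.
        self.values.extend(other.values)

    def evaluate(self, p):
        # Exact percentile, assuming linear interpolation between closest ranks.
        if not self.values:
            return None
        ordered = sorted(self.values)
        position = (len(ordered) - 1) * p
        lower = math.floor(position)
        upper = math.ceil(position)
        if lower == upper:
            return float(ordered[lower])
        fraction = position - lower
        return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def percentile(partitions, p):
    """Aggregate each partition separately, then merge the partial buffers."""
    merged = PercentileBuffer()
    for rows in partitions:
        partial = PercentileBuffer()
        for value in rows:
            partial.update(value)
        merged.merge(partial)
    return merged.evaluate(p)
```

For example, `percentile([[10, 20], [30, 40]], 0.95)` aggregates two partitions and interpolates between 30 and 40. The buffer size grows with the number of rows, which is the cost this sketch is meant to show.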

0 Answers