I'm looking for a list of pre-defined aggregation functions in Spark SQL. I have in mind something analogous to Presto Aggregate Functions.
I Ctrl+F'd around a little in the SQL API docs to no avail. It's also hard to tell at a glance which functions are aggregations and which are not. For example, if I didn't already know that avg is an aggregation function, I'd be hard pressed to tell from its entry (in a way that scales to the full set of functions):
avg
avg(expr) - Returns the mean calculated from values of a group.
If such a list doesn't exist, can someone at least confirm to me that there's no pre-defined function like any/bool_or or all/bool_and to determine if any or all of a boolean column in a group are true (or false)?
For now, my workaround is:
select grp_col, count(if(bool_col, true, NULL)) > 0 any_agg
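To make the workaround concrete, here is a minimal, self-contained sketch of the same conditional-count idea, extended with an "all" counterpart. It is only an illustration under assumptions: the table name my_table and its rows are hypothetical, and it runs against SQLite (through Python's sqlite3 module) rather than Spark, so it uses the portable CASE WHEN form instead of if(bool_col, true, NULL). I'd expect the aggregate expressions to carry over to Spark SQL, but I haven't verified that here.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (grp_col TEXT, bool_col INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("a", 1), ("a", 0), ("b", 0), ("b", 0), ("c", 1), ("c", 1)],
)

# COUNT ignores NULLs, so counting a CASE expression that yields NULL
# for non-matching rows counts only the matching rows.
#   any_agg: at least one true value in the group
#   all_agg: no false value in the group (NULLs are ignored)
query = """
SELECT grp_col,
       COUNT(CASE WHEN bool_col THEN 1 END) > 0     AS any_agg,
       COUNT(CASE WHEN NOT bool_col THEN 1 END) = 0 AS all_agg
FROM my_table
GROUP BY grp_col
ORDER BY grp_col
"""
rows = conn.execute(query).fetchall()

for grp_col, any_agg, all_agg in rows:
    # SQLite returns comparison results as 0/1 integers.
    print(grp_col, bool(any_agg), bool(all_agg))
```

Note that with this formulation a group containing only NULLs would report all_agg as true and any_agg as false, which may or may not be the behavior you want.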