I tried using .agg(avg("boolean_column")), but got the error:
"function average requires numeric types, not boolean"
How can I get the average of such a column?
Cast the column to a numeric type, then take the average. True becomes 1.0 and False becomes 0.0, so the result is the fraction of True values in each group:
from pyspark.sql.functions import avg, col
df.groupBy(...).agg(avg(col("boolean_column").cast("double")))
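To make it concrete what the cast-then-average computes, here is a rough plain-Python model of the semantics for a single group. It is only a sketch, not Spark code: `mean_of_booleans` is a hypothetical helper name, and it assumes the usual behaviour that a null boolean stays null after the cast and that `avg` skips nulls.

```python
from typing import Iterable, Optional


def mean_of_booleans(values: Iterable[Optional[bool]]) -> Optional[float]:
    """Plain-Python model of avg(col.cast("double")) over one group (hypothetical helper)."""
    # Cast step: True -> 1.0, False -> 0.0; None (null) stays None.
    as_double = [None if v is None else float(v) for v in values]
    # Average step: nulls are skipped; a group with no non-null values yields None.
    non_null = [x for x in as_double if x is not None]
    if not non_null:
        return None
    return sum(non_null) / len(non_null)


print(mean_of_booleans([True, False, True, True]))  # 0.75
print(mean_of_booleans([True, None, False]))        # 0.5
```

So the averaged value can be read directly as the share of True rows among the non-null rows of each group.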