How can I achieve this?
from pyspark.sql import functions as F
from pyspark.context import SparkContext
from pyspark.sql.session import SparkSession
sc = SparkContext('local')
spark = SparkSession(sc)
grouped = df.groupby([col list]).agg(F.count([col list]))  # [col list] is a placeholder for a list of column names only known at runtime
I've read similar questions on Stack Overflow but could not find an exact answer.
Even if I try with a single column:
grouped = dfn.groupby('col name').agg(F.count('col name'))
I get:
py4j\java_collections.py", line 500, in convert for element in object: TypeError: 'type' object is not iterable
Reference to a related question: pyspark Column is not iterable
I don't know the column names beforehand and need to provide a list as input to the groupBy and agg functions.
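To make the goal concrete, here is roughly the shape of code I'm trying to get working. The DataFrame contents and the group_cols/count_cols names below are placeholders I made up for illustration; in reality the lists arrive at runtime, and I'm not sure this is the right way to pass them to groupBy and agg:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.master('local').getOrCreate()

# small example DataFrame; in my real case the schema is not known beforehand
df = spark.createDataFrame([('a', 1), ('a', 2), ('b', 3)], ['colA', 'colB'])

group_cols = ['colA']   # columns to group by, received as a list at runtime
count_cols = ['colB']   # columns to count, received as a list at runtime

# my best guess: pass the list to groupBy and unpack one count expression per column into agg
grouped = df.groupBy(group_cols).agg(*[F.count(c).alias(c + '_count') for c in count_cols])
grouped.show()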