I am new to PySpark, so I wanted to know: is there a better way to group by multiple columns one at a time than looping over all of them? Currently I loop over every required group-by column, but it takes a very long time. I have around 50-60 columns, and for each one I need to group by that column and aggregate the same fixed set of columns.

Current code using the loop:
from pyspark.sql.functions import mean, count

for name in req_string_columns:
    tmp = (Selected_data.groupBy(name)
           .agg(mean("ABC"), mean("XYZ"), count("ABC"), count("XYZ"))
           .withColumnRenamed(name, 'Category'))
Is there any better way to do it?
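For example, would a single-pass approach along these lines be faster? This is only a rough sketch, assuming every column in req_string_columns is string-typed (so they can be stacked together): it unpivots the columns with stack() and then runs one groupBy over (column_name, Category) instead of one job per column. The names column_name, long_df and result are just placeholders I made up.

from pyspark.sql import functions as F

cols = req_string_columns

# Build a stack() expression that turns each wide row into one row per
# group-by column: (column_name, Category), keeping the measures ABC and XYZ.
stack_expr = "stack({n}, {pairs}) as (column_name, Category)".format(
    n=len(cols),
    pairs=", ".join("'{c}', `{c}`".format(c=c) for c in cols),
)

long_df = Selected_data.select(F.expr(stack_expr), "ABC", "XYZ")

# A single shuffle/aggregation instead of 50-60 separate ones.
result = (long_df
          .groupBy("column_name", "Category")
          .agg(F.mean("ABC"), F.mean("XYZ"), F.count("ABC"), F.count("XYZ")))

The trade-off I see is that the unpivoted DataFrame has roughly 50-60x as many rows before the aggregation, but everything happens in one Spark job rather than one job per column. Is that likely to be faster, or is there a cleaner way?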