I am familiar with pandas DataFrames, where I use groupby together with mode to get the most frequent values per group, like below:
df3 = df5.groupby(['band']).apply(lambda x: x.mode())
However, I am having difficulty achieving the same thing in PySpark.
I have a Spark DataFrame as follows:
band         A3  A5  status
4G_band1800  12  18  TRUE
4G_band1800  12  18  FALSE
4G_band1800  10  18  TRUE
4G_band1800  12  12  TRUE
4g_band2300   6  24  FALSE
4g_band2300   6  22  FALSE
4g_band2300   6  24  FALSE
4g_band2300   3  24  TRUE
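For reproducibility, the frame can be built like this (a minimal sketch; the values are copied from the table above, and the SparkSession setup is just the default):

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Build the example DataFrame shown above
df = spark.createDataFrame(
    [
        ("4G_band1800", 12, 18, True),
        ("4G_band1800", 12, 18, False),
        ("4G_band1800", 10, 18, True),
        ("4G_band1800", 12, 12, True),
        ("4g_band2300", 6, 24, False),
        ("4g_band2300", 6, 22, False),
        ("4g_band2300", 6, 24, False),
        ("4g_band2300", 3, 24, True),
    ],
    ["band", "A3", "A5", "status"],
)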
What I want is as follows:
band         A3  A5  status
4G_band1800  12  18  TRUE
4g_band2300   6  24  FALSE
That is, one row per band, with each column holding its most frequent value for that band. I have tried every combination of groupBy and aggregation I could think of, but haven't gotten any reasonable output. Please suggest a way.
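For example, one direction I tried was counting values per group and keeping the most frequent one with a window function; a sketch for a single column (the helper name mode_per_group is just illustrative):

from pyspark.sql import Window
from pyspark.sql import functions as F

def mode_per_group(df, group_col, value_col):
    # Count how often each value occurs within each group,
    # then keep only the most frequent value per group.
    counts = df.groupBy(group_col, value_col).count()
    w = Window.partitionBy(group_col).orderBy(F.desc("count"))
    return (counts
            .withColumn("rn", F.row_number().over(w))
            .filter(F.col("rn") == 1)
            .select(group_col, value_col))

mode_per_group(df, "band", "A3").show()

This gives me the mode of one column at a time, but I could not find a clean way to combine the per-column results into a single row per band, as in the desired output above.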