I have a DataFrame with configurable column names, e.g.:
Journey channelA channelB channelC
j1 1 0 0
j1 0 1 0
j1 1 0 0
j2 0 0 1
j2 0 1 0
By configurable, I mean there could be any number (n) of channel columns in the DataFrame.
Now I need to do a transformation that computes the sum of each channel per journey, something like:
df.groupBy("Journey").agg(sum("channelA"), sum("channelB"), sum("channelC"))
The output of which would be:
Journey sum(channelA) sum(channelB) sum(channelC)
j1 2 1 0
j2 0 1 1
Now I want to rename the summed columns back to their original names. For a single column I could do it with:
.withColumnRenamed("sum(channelA)", "channelA")
But, as I mentioned, the channel list is configurable, so I want a generic rename that maps every summed column back to its original column name, giving this expected DataFrame:
Journey channelA channelB channelC
j1 2 1 0
j2 0 1 1
Any suggestions on how to approach this?
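
One possible approach, sketched here under the assumption that this is PySpark and that every column other than Journey is a channel: give each sum an alias at aggregation time, so there is nothing left to rename afterwards. The helper names channel_columns and sum_channels are hypothetical, made up for this illustration.

```python
def channel_columns(columns, key="Journey"):
    """Return every column name except the grouping key.

    Assumption: all non-key columns are channels to be summed.
    """
    return [c for c in columns if c != key]


def sum_channels(df, key="Journey"):
    """Group by `key` and sum each channel, keeping the original names."""
    # Imported inside the function so channel_columns stays usable
    # without Spark installed.
    from pyspark.sql import functions as F

    channels = channel_columns(df.columns, key)
    # alias(c) names each sum after its source column, so the result has
    # "channelA" rather than "sum(channelA)" and needs no
    # withColumnRenamed step.
    # Note: agg() needs at least one expression, so this assumes the
    # DataFrame has at least one channel column.
    return df.groupBy(key).agg(*[F.sum(c).alias(c) for c in channels])
```

Calling sum_channels(df) on the sample data should then produce the Journey, channelA, channelB, channelC layout shown above. If the code is Scala rather than Python, Column has an equivalent alias method, so the same idea should carry over.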