I have data as below:
n1 d1 un1 mt1 1
n1 d1 un1 mt2 2
n1 d1 un1 mt3 3
n1 d1 un1 mt4 4
n1 d2 un1 mt1 3
n1 d2 un1 mt3 3
n1 d2 un1 mt4 4
n1 d2 un1 mt5 6
n1 d2 un1 mt2 3
I want to get the output as below:
n1 d1 un1 0.75
n1 d2 un1 1.5
i.e. do a groupBy on the 1st, 2nd and 3rd columns, and compute the 4th output column within each group as (mt1 + mt2) / mt4, where mtX stands for the value in the 5th input column on the row whose 4th column is mtX.
I am trying to do this with a Spark DataFrame. Assuming the data is in a DataFrame a with column names n, d, un, mt, r, this is what I have so far:
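To make the formula concrete, here is a minimal sketch of the computation I am after, written with plain Scala collections rather than Spark. The names `Rec`, `sample` and `ratioPerGroup` are made up for illustration, and the sketch assumes every group contains an mt4 row (otherwise the division would not give a meaningful number).

```scala
// Plain Scala only, no Spark. Rec, sample and ratioPerGroup are hypothetical names.
object GroupRatio {
  // One input row: three grouping columns, the mt label, and the numeric value r.
  final case class Rec(n: String, d: String, un: String, mt: String, r: Double)

  // The sample data from the question.
  val sample: Seq[Rec] = Seq(
    Rec("n1", "d1", "un1", "mt1", 1.0),
    Rec("n1", "d1", "un1", "mt2", 2.0),
    Rec("n1", "d1", "un1", "mt3", 3.0),
    Rec("n1", "d1", "un1", "mt4", 4.0),
    Rec("n1", "d2", "un1", "mt1", 3.0),
    Rec("n1", "d2", "un1", "mt3", 3.0),
    Rec("n1", "d2", "un1", "mt4", 4.0),
    Rec("n1", "d2", "un1", "mt5", 6.0),
    Rec("n1", "d2", "un1", "mt2", 3.0)
  )

  // Group by (n, d, un) and compute (mt1 + mt2) / mt4 inside each group.
  def ratioPerGroup(rows: Seq[Rec]): Map[(String, String, String), Double] =
    rows.groupBy(x => (x.n, x.d, x.un)).map { case (key, group) =>
      // Sum of r over the rows in this group carrying the given mt label (0.0 if absent).
      def total(label: String): Double = group.filter(_.mt == label).map(_.r).sum
      key -> ((total("mt1") + total("mt2")) / total("mt4"))
    }
}
```

On the sample data this gives (1 + 2) / 4 for the d1 group and (3 + 3) / 4 for the d2 group, which matches the expected output above.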
sqlContext.udf.register("aggUDF",(v:List(mt,r))=> ?)
val b = a.groupBy("n","d","un").agg(callUdf("aggUDF",List((mt,r)) should go here))