How to normalize with MinMaxScaler within each group after a groupBy on a Spark DataFrame?

Asked Jan 20 '17 at 07:38

Active Jan 20 '17 at 07:38

Viewed 287 times

Calling groupBy returns a GroupedData object. How can I apply the normalization to each group of data separately? For example, at the moment I do something like this:

val df_list = trans.map { case (name, df) =>
  println(name)                      // group key
  val scalerModel = scaler.fit(df)   // fit a separate scaler model on this group only
  scalerModel.transform(df)          // rescale this group's "features" with its own min/max
}

where trans is an array of (String, DataFrame) pairs and each DataFrame has a "features" column. This works, but it is not very efficient. Is there a better way?
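One possible alternative, sketched here under assumptions that go beyond the question: if the value to scale is a plain numeric column rather than a Vector "features" column, the per-group minimum and maximum can be computed with a window partitioned by the group key, so all groups are rescaled in a single job instead of one fit per group. The object name GroupMinMax, the helper scaleWithinGroups, and the column names "group" and "value" are hypothetical and only serve the illustration. This does not use MinMaxScaler itself; it reproduces its default [0, 1] rescaling by hand, and a Vector column would first have to be unpacked into scalar columns.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, lit, max, min, when}

object GroupMinMax {
  // Rescale `valueCol` to [0, 1] separately within each value of `groupCol`.
  // Assumption: `valueCol` is a scalar numeric column, not an ML Vector.
  def scaleWithinGroups(df: DataFrame, groupCol: String, valueCol: String): DataFrame = {
    val w  = Window.partitionBy(groupCol)
    val lo = min(col(valueCol)).over(w) // per-group minimum
    val hi = max(col(valueCol)).over(w) // per-group maximum
    df.withColumn(
      s"${valueCol}_scaled",
      // A constant group has hi == lo; map it to 0.5, which is what
      // MinMaxScaler yields for a constant feature with the default [0, 1] range.
      when(hi === lo, lit(0.5)).otherwise((col(valueCol) - lo) / (hi - lo))
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("group-minmax")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical toy data: two groups with different value ranges.
    val df = Seq(("a", 1.0), ("a", 3.0), ("a", 5.0), ("b", 10.0), ("b", 20.0))
      .toDF("group", "value")

    scaleWithinGroups(df, "group", "value").show()
    spark.stop()
  }
}
```

Whether this is faster than the map over trans depends on the data: the window forces a shuffle by group key, but it avoids launching a separate fit and transform for every group from the driver.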


WU Zijun


0 Answers