I want to apply a function to every row of a DataFrame. Example:

|A  |B   |C   |
|1  |3   |5   |
|6  |2   |0   |
|8  |2   |7   |
|0  |9   |4   |


myFunction(df)

def myFunction(df: DataFrame): DataFrame = {
  // Apply sum of columns on each row
  ???
}

Wanted output:

1+3+5 = 9
6+2+0 = 8
...

How can that be done in Scala, please? I followed this but had no luck.

Haha

2 Answers

It's simple. You don't need to write a function for this: just create a new column by summing the columns you want.

scala> df.show
+---+---+---+
|  A|  B|  C|
+---+---+---+
|  1|  2|  3|
|  1|  2|  4|
|  1|  2|  5|
+---+---+---+


scala> df.withColumn("sum",col("A")+col("B")+col("C")).show
+---+---+---+---+
|  A|  B|  C|sum|
+---+---+---+---+
|  1|  2|  3|  6|
|  1|  2|  4|  7|
|  1|  2|  5|  8|
+---+---+---+---+
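
Wrapped up in the shape the question asks for, the same idea might look like the sketch below. The helper name myFunction, the output column name "sum", and the local SparkSession setup are assumptions made for illustration; the data is the sample from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Local session for experimentation (assumption); in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("row-sum").getOrCreate()
import spark.implicits._

// Sample data from the question.
val df = Seq((1, 3, 5), (6, 2, 0), (8, 2, 7), (0, 9, 4)).toDF("A", "B", "C")

// Hypothetical helper: appends a "sum" column holding A + B + C for each row.
def myFunction(df: DataFrame): DataFrame =
  df.withColumn("sum", col("A") + col("B") + col("C"))

// Collect only the new column to inspect the per-row sums.
val sums = myFunction(df).select("sum").as[Int].collect()
```

Because the sum is expressed with Column objects, Spark evaluates it as part of the query plan rather than calling back into user code for each row.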

Edited:

You can also run map over each row and compute the sum, accessing the values either by row index or by field name.

scala> df.map(x=>x.getInt(0) + x.getInt(1) + x.getInt(2)).toDF("sum").show
+---+
|sum|
+---+
|  6|
|  7|
|  8|
+---+


scala> df.map(x=>x.getInt(x.fieldIndex("A")) + x.getInt(x.fieldIndex("B")) + x.getInt(x.fieldIndex("C"))).toDF("sum").show
+---+
|sum|
+---+
|  6|
|  7|
|  8|
+---+
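
If the number of columns is not fixed, the same map approach can loop over every field of the row instead of naming each one. This is only a sketch: it assumes every column is a non-null Int, and the SparkSession setup is added here for self-containment.

```scala
import org.apache.spark.sql.SparkSession

// Local session for experimentation (assumption); in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("row-map-sum").getOrCreate()
import spark.implicits._ // provides the Encoder[Int] that map needs

val df = Seq((1, 2, 3), (1, 2, 4), (1, 2, 5)).toDF("A", "B", "C")

// Sum all fields of each Row by position; assumes all columns are non-null Ints.
val rowSums = df.map(row => (0 until row.length).map(i => row.getInt(i)).sum)

val result = rowSums.collect()
```

Here rowSums is a Dataset[Int], so toDF("sum") can be chained on it exactly as in the examples above.
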
Goldie

map is the solution if you want to apply a function to every row of a DataFrame. For every Row, you can return a tuple, and a new RDD is made (a Dataset in Spark 2.x and later).

This is perfect when working with a Dataset or an RDD, but less so for a DataFrame. For your use case with a DataFrame, I would recommend just adding a column and using Column objects to do what you want.

import org.apache.spark.sql.functions.{col, expr}

// Using expr
df.withColumn("TOTAL", expr("A+B+C"))
// Using columns
df.withColumn("TOTAL", col("A")+col("B")+col("C"))
// Using dynamic selection of all columns
df.withColumn("TOTAL", df.columns.map(col).reduce((c1, c2) => c1 + c2))

In that case, you'll be very interested in this question. A UDF is also a good solution and is better explained here.
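
For reference, a UDF version could be sketched as follows. The name sum3 and the session setup are made up for illustration, and the sketch assumes all three columns are non-null Ints. Note that Spark treats a UDF as a black box, so the built-in column expressions are usually the better choice when they can express the logic.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Local session for experimentation (assumption); in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("row-udf-sum").getOrCreate()
import spark.implicits._

val df = Seq((1, 3, 5), (6, 2, 0)).toDF("A", "B", "C")

// Hypothetical UDF wrapping an ordinary Scala function; assumes non-null Int inputs.
val sum3 = udf((a: Int, b: Int, c: Int) => a + b + c)

val withTotal = df.withColumn("TOTAL", sum3(col("A"), col("B"), col("C")))
val totals = withTotal.select("TOTAL").as[Int].collect()
```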

If you don't want to keep the source columns, you can replace .withColumn(name, value) with .select(value.alias(name)).
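
As a rough illustration of that replacement, the sketch below keeps only the computed column. The sample data and session setup are assumptions, and the sum covers every column of the DataFrame, so it only makes sense when all columns are numeric.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Local session for experimentation (assumption); in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[*]").appName("row-select-sum").getOrCreate()
import spark.implicits._

val df = Seq((1, 3, 5), (6, 2, 0)).toDF("A", "B", "C")

// Build one Column expression that adds up every column of df.
val total = df.columns.map(col).reduce(_ + _)

// select + alias returns only TOTAL; withColumn would have kept A, B and C as well.
val onlyTotal = df.select(total.alias("TOTAL"))
val values = onlyTotal.as[Int].collect()
```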

Rafaël