
Here is my data: an `RDD[Array[String]]` in Spark. I want to compute the total number of elements across all the arrays in data.

For example, for data = (Array(1,2), Array(1,2,3)) I want the sum 2 + 3 = 5. At first I tried `data.flatMap(_).count()`, which fails with:

error: missing parameter type for expanded function ((x$1) => data.flatMap(x$1))

But when I replace `_` with `x => x` and write `data.flatMap(x => x).count()`, it works. So I am confused by `_`. I thought `_` could stand in for the function's parameter — is that right?
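The same behaviour can be reproduced without a Spark cluster — a minimal sketch using a plain Scala `Seq` in place of the RDD (the eta-expansion rules for `_` are identical; only `.count()` becomes `.size` here, since `count` on plain collections takes a predicate):

```scala
object PlaceholderDemo {
  // Stand-in for the RDD[Array[String]] from the question.
  val data = Seq(Array("1", "2"), Array("1", "2", "3"))

  // data.flatMap(_) does NOT compile: a bare `_` is not `x => x`.
  val total1 = data.flatMap(x => x).size // explicit lambda: 2 + 3 = 5
  val total2 = data.map(_.length).sum    // `_` with one member applied: also 5

  def main(args: Array[String]): Unit = {
    println(total1)
    println(total2)
  }
}
```

Note that `_.length` works where a bare `_` does not, because the placeholder has exactly one operation applied to it.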

user3190018
  • No idea why this is marked as a duplicate. A short answer: `_` is not `x => x`, but `_ + 1` is `x => x + 1` and `_.toArray[Int]` is `x => x.toArray[Int]`. If you use the last one in your code, it will achieve the same result. A rule of thumb: use `_` only if you have exactly **one** operation to apply to it, not two and not zero. – yuxiang.li Apr 06 '18 at 18:28

1 Answer


Refer to the question here.

Essentially, `_` by itself does not define a function. Inside an anonymous-function body it acts as a placeholder for a parameter, but a bare `_` passed as the whole argument does not mean `x => x`; as the error message shows, the compiler instead expands it over the enclosing call, producing `x => data.flatMap(x)`, for which it cannot infer a parameter type.
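To make the expansion concrete, here is a small sketch (using a plain `Seq` rather than an RDD; the placeholder rules are the same) contrasting the broken and working forms:

```scala
object UnderscoreScope {
  val data = Seq(Array(1, 2), Array(1, 2, 3))

  // Broken: `data.flatMap(_)` expands to `x => data.flatMap(x)` --
  // a new function over the whole call, not the identity x => x.

  // Working: an explicit lambda flattens the arrays.
  val flattened = data.flatMap(x => x) // Seq(1, 2, 1, 2, 3)

  // Working: `_` with exactly one operation applied expands as intended.
  val lengths = data.map(_.length)     // Seq(2, 3)
  val total   = lengths.sum            // 5

  def main(args: Array[String]): Unit = println(total)
}
```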

Azure Heights