
I am trying to use the mapPartitions function instead of map. The problem is that I want to pass an Array as an argument, but mapPartitions does not take an Array as an argument. How can I pass the array to it?

def mapPartitions[U: ClassTag](
    f: Iterator[T] => Iterator[U], preservesPartitioning: Boolean = false): RDD[U]
zero323
zhengjw
  • Are you trying to access data in the array within the `mapPartitions`? If so, you could simply broadcast the array as a variable. – Glennie Helles Sindholt Oct 20 '15 at 10:36
  • Could you either [accept the answer](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work) or explain why it doesn't work for you so it can be improved? You also have quite a few other questions with answers that are just waiting to be accepted. Thanks in advance. – zero323 Apr 24 '16 at 12:09

1 Answer


It is not clear what you're asking, so I am going to guess that you have a function that looks more or less like this:

def foo(iter: Iterator[T], xs: Array[V]): Iterator[U] = ???

and you want to pass it to mapPartitions.

You have three options:

  1. Use an anonymous function:

    val xs: Array[V] = ???
    val rdd: RDD[T] = ???
    
    rdd.mapPartitions(iter => foo(iter, xs))
    
  2. Rewrite foo to support currying:

    def foo(xs: Array[V])(iter: Iterator[T]): Iterator[U] = ??? // Rest as before
    
    rdd.mapPartitions(foo(xs))
    
  3. Wrap foo in a function value that closes over xs:

    val bar = (iter: Iterator[T]) => foo(iter, xs)
    
    rdd.mapPartitions(bar)
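
To try the three options without a Spark cluster, here is a minimal plain-Scala sketch. It rests on several assumptions: `mapPartitionsLocal` is a hypothetical stand-in for `RDD.mapPartitions` that works on nested local sequences, `T`, `V` and `U` are fixed to `Int`, `String` and `String`, and `foo` simply looks up each element as an index in `xs`. None of these names or choices come from Spark or from the question.

```scala
// Plain-Scala sketch, no Spark required. All names here are illustrative.
object MapPartitionsArgs {
  // Hypothetical stand-in for RDD.mapPartitions: applies f once per "partition".
  def mapPartitionsLocal[T, U](partitions: Seq[Seq[T]])(
      f: Iterator[T] => Iterator[U]): Seq[Vector[U]] =
    partitions.map(p => f(p.iterator).toVector)

  // Two-argument function as in the answer: uses each element as an index into xs.
  def foo(iter: Iterator[Int], xs: Array[String]): Iterator[String] =
    iter.map(i => xs(i))

  // Curried variant (option 2): the array comes first, the iterator last.
  def fooCurried(xs: Array[String])(iter: Iterator[Int]): Iterator[String] =
    foo(iter, xs)

  val xs: Array[String] = Array("a", "b", "c")
  val partitions: Seq[Seq[Int]] = Seq(Seq(0, 1), Seq(2, 0))

  // Option 1: anonymous function that captures xs.
  val viaLambda: Seq[Vector[String]] =
    mapPartitionsLocal(partitions)(iter => foo(iter, xs))

  // Option 2: supply only the first parameter list of the curried method.
  val viaCurried: Seq[Vector[String]] =
    mapPartitionsLocal(partitions)(fooCurried(xs))

  // Option 3: a named function value that captures xs.
  val bar: Iterator[Int] => Iterator[String] = iter => foo(iter, xs)
  val viaFunctionValue: Seq[Vector[String]] =
    mapPartitionsLocal(partitions)(bar)

  def main(args: Array[String]): Unit = {
    println(viaLambda)
    println(viaCurried)
    println(viaFunctionValue)
  }
}
```

All three variants produce the same result, because each one just hands `mapPartitionsLocal` a function of type `Iterator[Int] => Iterator[String]` with `xs` already bound. In Spark, `xs` is captured in the closure and serialized with each task, so for a large array the broadcast variable suggested in the comments may be the better fit.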
    
zero323
  • I just encountered what seems to be the same limitation that this question (or rather the answer) deals with. I followed the line of thought implied in this answer and posted a clearer question [here](https://stackoverflow.com/q/61527641/1509695). – matanster Apr 30 '20 at 16:12