Return multiple RDDs in a single run of a map operation

Question

I am doing some operations with GraphX and want something like this:

val ans = graph.triplets.map(
    e => {
        if (condition1) {
            return ans_1 to RDD_1
        }
        else if (condition2) {
            return ans_2 to RDD_2
        }
    }
)

I know I can run graph.triplets.map twice to get 2 different RDDs, like this:

val RDD_1 = graph.triplets.map(
    e => {
        if (condition1) {
            return ans_1
        }
    })
val RDD_2 = graph.triplets.map(
    e => {
        if (condition2){
            return ans_2
        }
    })

However, to improve efficiency I want to do it in a single run, as sketched in the first snippet. How can I achieve this?

@RameshMaharjan If so, how can I create this kind of tuple, and what is its structure? RDD_1 and RDD_2 are not the same size. — Litchy, May 07 '18 at 05:57
@RameshMaharjan I added the 2-run version. It is obviously slower because it runs one more traversal. — Litchy, May 07 '18 at 06:33
@RameshMaharjan Must it have an else expression? The official Spark example in `spark/examples/graphx/AggregateMessagesExample.scala` does not. — Litchy, May 07 '18 at 06:44
Why don't you use filter instead of map? Filter twice, the same way you are mapping twice. That would be more efficient than map. — Ramesh Maharjan, May 07 '18 at 07:41
Possible duplicate of [How do I split an RDD into two or more RDDs?](https://stackoverflow.com/questions/32970709/how-do-i-split-an-rdd-into-two-or-more-rdds) — Alper t. Turker, May 07 '18 at 09:52
@user9613318 Thank you. But that is in Python and relies on Python-specific syntax. How can I do it in Scala? — Litchy, May 07 '18 at 10:04
@Litchy I don't think the code is that important there. What matters is the conclusion: if you want two RDDs, you'll need two separate transformations. — Alper t. Turker, May 07 '18 at 10:10
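
The conclusion of the comments above can be sketched in Scala. This is a minimal illustration, not a confirmed answer from the thread: it assumes Spark with GraphX on the classpath, and the object `SplitTriplets`, the helpers `classify`, `split` and `sampleGraph`, and the concrete conditions and results are all hypothetical stand-ins for `condition1`, `condition2`, `ans_1` and `ans_2`. The triplets are tagged with `Either` in one pass, the tagged RDD is persisted, and two `collect` transformations then split it, so the triplet traversal itself should run only once while each output RDD is still its own transformation.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeTriplet, Graph}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object SplitTriplets {

  // Hypothetical classification: Left stands in for ans_1, Right for ans_2.
  // Triplets matching neither condition are dropped (None).
  def classify(e: EdgeTriplet[Int, Int]): Option[Either[String, Long]] =
    if (e.srcAttr < e.dstAttr) Some(Left(s"${e.srcId}->${e.dstId}")) // "condition1"
    else if (e.attr > 10) Some(Right(e.srcId + e.dstId))             // "condition2"
    else None

  // One pass over the triplets produces the tagged RDD; persisting it lets the
  // two collect transformations below reuse it instead of recomputing triplets.
  def split(graph: Graph[Int, Int]): (RDD[String], RDD[Long]) = {
    val tagged: RDD[Either[String, Long]] =
      graph.triplets
        .flatMap(e => classify(e).iterator)
        .persist(StorageLevel.MEMORY_ONLY)

    val rdd1: RDD[String] = tagged.collect { case Left(a) => a }
    val rdd2: RDD[Long]   = tagged.collect { case Right(b) => b }
    (rdd1, rdd2)
  }

  // Tiny made-up graph used only for illustration.
  def sampleGraph(sc: SparkContext): Graph[Int, Int] = {
    val vertices = sc.parallelize(Seq((1L, 1), (2L, 5), (3L, 3)))
    val edges    = sc.parallelize(Seq(Edge(1L, 2L, 0), Edge(2L, 3L, 20), Edge(3L, 1L, 0)))
    Graph(vertices, edges)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("split-triplets")
    val sc   = new SparkContext(conf)
    try {
      val (rdd1, rdd2) = split(sampleGraph(sc))
      println(rdd1.collect().mkString(", ")) // 1->2
      println(rdd2.collect().mkString(", ")) // 5
    } finally {
      sc.stop()
    }
  }
}
```

Whether this beats two plain `filter` or `map` passes depends on how expensive the per-triplet work is compared with the cost of caching the tagged results, so it is worth measuring on the real graph.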
