I am using pyspark to find all possible pairs from an RDD of int array.
Input:
[[0, 1, 2],
[3, 4]]
Output RDD key value pairs of all possible combinations:
(0,1), (0,2), (1,0), .... (3,4), (4,3)
I want to implement it in python not scala
I am using pyspark to find all possible pairs from an RDD of int array.
Input:
[[0, 1, 2],
[3, 4]]
Output RDD key value pairs of all possible combinations:
(0,1), (0,2), (1,0), .... (3,4), (4,3)
I want to implement it in python not scala
permutations
can be used to used to generate tuples of length 2.
from itertools import permutations
rdd = spark.sparkContext.parallelize([[0, 1, 2],[3, 4]])
rdd.flatMap(lambda x: permutations(x,2)).collect()
[(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (3, 4), (4, 3)]