I'm very new to using Google Cloud Dataflow. I would like to get the Cartesian product of two PCollections. For example, if I have two PCollections (1, 2) and ("hello", "world"), their Cartesian product is ((1, "hello"), (1, "world"), (2, "hello"), (2, "world")).
Any ideas on how I could do that? Also, since the Cartesian product could be large, I'm hoping for a solution that creates the product lazily and so avoids huge memory consumption.
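To make the goal concrete, here is a minimal plain-Python sketch of the behaviour I'm after. It is not Dataflow code and does not use PCollections; the function name `lazy_cartesian_product` is just something I made up for illustration, and it assumes the second input can be iterated more than once.

```python
from typing import Iterable, Iterator, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def lazy_cartesian_product(left: Iterable[A], right: Iterable[B]) -> Iterator[Tuple[A, B]]:
    """Yield (a, b) pairs one at a time instead of building the whole product."""
    # Assumption: `right` is re-iterable (e.g. a list), because it is
    # scanned once for every element of `left`.
    for a in left:
        for b in right:
            yield (a, b)


pairs = lazy_cartesian_product([1, 2], ["hello", "world"])
print(list(pairs))
# prints [(1, 'hello'), (1, 'world'), (2, 'hello'), (2, 'world')]
```

Because this is a generator, only one pair exists in memory at a time. I'm looking for the equivalent of this on two PCollections.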
Thanks!