I am trying to find out any information on the ordering of the rows in a RDD. Here is what I am trying to do:
Rdd1, Rdd2
Rdd3 = Rdd1.union(rdd2);
in Rdd3, is there any guarantee that rdd1 records will appear first and rdd2 afterwards? For my tests I saw this behaviorunion happening but wasn't able to find it in any docs.
just FI, I really do not care about the ordering of RDDs in itself (i.e. rdd2's or rdd1's data order is really not concern but after union Rdd1 record data must come first is the requirement).