I want to merge records that share a key, but I don't want to lose the unpaired records either. For example, I have the following pair RDD:
(key=1, (2, created_on))
(key=1, (3, created_on))
(key=2, (5, created_on))
Now, when I use reduceByKey with a function that keeps the record with the latest 'created_on', it merges the first 2 records into 1 record, the most recent one. This is the correct behavior.
However, the 3rd record is missing. How can I get the unpaired RDD record so that I can union it with the merged RDD?
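
To make the reduction described above concrete, here is a minimal plain-Python sketch of a per-key reduce that keeps the latest 'created_on'. It does not use Spark itself: `reduce_by_key` and `latest` are hypothetical helper names, the timestamps are made-up ISO date strings, and the model only assumes that the merge function is applied when a key has two or more values.

```python
from functools import reduce
from itertools import groupby
from operator import itemgetter

# Hypothetical sample data shaped like the RDD above: (key, (value, created_on)).
# The created_on values are invented ISO dates, which compare correctly as strings.
records = [
    (1, (2, "2020-01-01")),
    (1, (3, "2020-02-01")),
    (2, (5, "2020-01-15")),
]


def latest(a, b):
    """Keep whichever (value, created_on) tuple has the later created_on."""
    return a if a[1] >= b[1] else b


def reduce_by_key(pairs, func):
    """Plain-Python model of a per-key reduce (not Spark's implementation).

    func is only called for keys that have two or more values; a key with
    a single value keeps that value as-is.
    """
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: reduce(func, (value for _, value in group))
        for key, group in groupby(ordered, key=itemgetter(0))
    }


merged = reduce_by_key(records, latest)
print(merged)
```

In this model, key 1 collapses to its most recent record and key 2 is carried through unchanged, because the merge function is never invoked for a single value. Comparing this baseline against the actual Spark output may help narrow down where the 3rd record is dropped.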