I have an RDD with irregular time series data, keyed by timestamp:
1, <value1>
4, <value4>
6, <value6>
..etc.
and I need to turn it into a regular time series, where each missing timestamp takes the most recent earlier value:
1, <value1>
2, <value1>
3, <value1>
4, <value4>
5, <value4>
6, <value6>
..etc.
So far I have created an RDD with the keys 1, 2, 3, 4, 5, 6, ... and leftOuterJoin'ed it to the original RDD, which gave me:
1, <value1>
2, <None>
3, <None>
4, <value4>
5, <None>
6, <value6>
..etc.
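For reference, the join step looks roughly like the sketch below. It is not my exact code: the object and function names (`JoinSketch`, `joinToGrid`), the local master, the `Long` timestamps and the placeholder string values are all assumptions made for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object JoinSketch {
  // Hypothetical helper: joins a regular grid of timestamps first..last
  // against the irregular series and keeps the optional value per timestamp.
  def joinToGrid(sc: SparkContext,
                 irregular: RDD[(Long, String)],
                 first: Long,
                 last: Long): RDD[(Long, Option[String])] = {
    // Regular grid, keyed so that it can be joined; the Unit value is a dummy.
    val grid = sc.parallelize(first to last).map(t => (t, ()))
    // leftOuterJoin yields (t, ((), Option[String])); keep only the Option.
    grid.leftOuterJoin(irregular).mapValues(_._2).sortByKey()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master, just to make the sketch self-contained.
    val sc = new SparkContext(
      new SparkConf().setAppName("fill-sketch").setMaster("local[*]"))
    // Placeholder data standing in for <value1>, <value4>, <value6>.
    val irregular = sc.parallelize(
      Seq((1L, "value1"), (4L, "value4"), (6L, "value6")))
    joinToGrid(sc, irregular, 1L, 6L).collect().foreach(println)
    sc.stop()
  }
}
```

In Scala the missing rows come back as `None` and the present ones as `Some(value)`, which matches the output shown above.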
The problem I am facing is filling rows 2, 3 and 5 with the value from the previous non-null row.
I would prefer to do this at the RDD level without going to Spark SQL, which is of course a last-resort option. Collecting the data into a Scala Array isn't very inviting either, since for performance reasons I would prefer to keep everything at the RDD level.
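To make the desired behaviour concrete, here is a minimal sketch of the fill on a plain local collection. This is exactly the collect-to-the-driver approach I would like to avoid, so it only illustrates the semantics, not a solution. The names `ForwardFillSketch` and `forwardFill` are hypothetical, and I am assuming the rows are already sorted by timestamp.

```scala
object ForwardFillSketch {
  // Carries the last defined value forward over the gaps.
  // Rows before the first defined value stay None.
  // Assumes `rows` is already sorted by key.
  def forwardFill[K, V](rows: Seq[(K, Option[V])]): Seq[(K, Option[V])] = {
    val filled = rows
      .map(_._2)
      // Running "last seen value": prefer the current value, else the previous one.
      .scanLeft(Option.empty[V])((last, cur) => cur.orElse(last))
      .tail // drop the initial seed so the lengths match
    rows.map(_._1).zip(filled)
  }

  def main(args: Array[String]): Unit = {
    // Placeholder data standing in for the joined output above.
    val joined: Seq[(Int, Option[String])] = Seq(
      1 -> Some("value1"), 2 -> None, 3 -> None,
      4 -> Some("value4"), 5 -> None, 6 -> Some("value6"))
    forwardFill(joined).foreach(println)
  }
}
```

What I am looking for is the same result computed on the RDD itself, where a gap at the start of one partition may need a value from the end of the previous partition.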
Thanks