
I've got a sequence of values (from a reduceByKey). I know that in theory the keys form an ordered sequence, and I should be able to reduce over them.

I want to run a sliding window over these sequences. I could store the window in the accumulator for the reduce function, but I think the way Spark parallelizes the workflow (requiring reduce functions to be commutative and associative) means the windows may get chopped up across partitions.

Is there a way to do this?

Joe
  • I think this is a duplicate of http://stackoverflow.com/questions/23402303/apache-spark-moving-average/23436517. – Daniel Darabos Dec 10 '14 at 09:12
  • It may be, although I'm looking for an operation on a key-value set with reduceByKey. – Joe Dec 10 '14 at 12:56
  • Sorry, I missed that part. Then your data is in fact not ordered. In `spark-shell` check out `sc.parallelize(Seq(7 -> 1, 7 -> 1, 8 -> 1, 8 -> 1)).reduceByKey(_ + _).collect`. It returns `Array((8,2), (7,2))`. – Daniel Darabos Dec 10 '14 at 14:04
  • Well then I have no hope. I'm closing as a duplicate as you suggested, because there is always some flexibility. – Joe Dec 10 '14 at 14:39

0 Answers
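For anyone landing here: following the approach in the linked duplicate, one way to get ordered windows is to sort the reduced RDD by key and then apply MLlib's `sliding` transformation from `org.apache.spark.mllib.rdd.RDDFunctions`. A minimal sketch, assuming Spark with MLlib on the classpath and reusing Daniel's sample data from the comments; the window size of 2 is arbitrary:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.rdd.RDDFunctions._

object SlidingAfterReduce {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("sliding-after-reduce").setMaster("local[*]"))

    // reduceByKey gives no ordering guarantee (see the comments above),
    // so impose the key order explicitly before windowing.
    val reduced = sc.parallelize(Seq(7 -> 1, 7 -> 1, 8 -> 1, 8 -> 1))
      .reduceByKey(_ + _)
      .sortByKey()

    // sliding(2) yields every run of 2 consecutive (key, sum) records,
    // including windows that span partition boundaries.
    val windows = reduced.sliding(2).collect()
    windows.foreach(w => println(w.mkString("[", ", ", "]")))
    // prints: [(7,2), (8,2)]

    sc.stop()
  }
}
```

The point of `sliding` here is that it handles windows straddling partition boundaries, which is exactly what a hand-rolled window kept in the reduce function's accumulator cannot guarantee.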