0

Assuming there is a finite DataStream (from a database source, for example) with events

  • a1, a2, ..., an.

How to append one more event b to this stream to get

  • a1, a2, ..., an, b

(i.e. output the added event after all original events, preserving the original ordering)?

I know that all finite streams emit the MAX_WATERMARK after all events. So, is there a way to "catch" this watermark and output the additional event after it?

(Unfortunately, .union()ing the original DataStream with another DataStream consisting of a single event (with timestamp set to Long.MaxValue) and then sorting the united stream using this answer did not work.)

trolley813
  • 874
  • 9
  • 15
  • Do you know the count ahead of time? Also, if it is a finite set, why can't you use the DataSet API instead of the DataStream? – austin_ce Feb 14 '19 at 15:18

2 Answers2

2

Maybe I'm missing something, but it seems like you could simply have a ProcessFunction with an event time timer set for somewhere in the distant future, so that it only fires when the MAX_WATERMARK arrives. And then in the onTimer method, emit that special event if the currentWatermark is MAX_WATERMARK.

David Anderson
  • 39,434
  • 4
  • 33
  • 60
0

Another approach might be to 'wrap' the original data source in another data source, which emits a final element when the delegate object's run() method returns. You'd need to be careful to call through to all of the delegate methods, of course.

kkrugler
  • 8,145
  • 6
  • 24
  • 18