What is a watermark in Flink with respect to event-time processing, and why is it needed? In particular, why is it needed in every case where event time is used? By "every case" I mean: if I don't perform a window operation, why do we still need a watermark? I come from a Spark background, where watermarks are needed only when we use windows on the incoming events.
I have read a few articles, and it seems to me that watermarks and windows are the same thing. If there are differences, please explain and point them out.
After your reply I did some more reading. Below is a more specific question.
Main question: Why do we need an out-of-orderness bound when we have allowed lateness?
Consider the example below:
Assume you have a BoundedOutOfOrdernessTimestampExtractor with a 2 minute bound and a 10 minute tumbling window that starts at 12:00 and ends at 12:10:
12:01, A
12:04, B
WM, 12:02 // 12:04 - 2 minutes
12:02, C
12:08, D
12:14, E
WM, 12:12 // 12:14 - 2 minutes
12:16, F
WM, 12:14 // 12:16 - 2 minutes
12:09, G
In the above example, the record [12:02, C] is not dropped; it is included in the 12:00 - 12:10 window and evaluated later. Hence the watermark could just as well be the event timestamp itself.
The record [12:09, G] is included in the 12:00 - 12:10 window only when an allowed lateness of 5 minutes is configured. This takes care of late and out-of-order events.
So, adding to my previous question: why is it necessary to give BoundedOutOfOrdernessTimestampExtractor an out-of-orderness bound of some value other than 0, instead of using the event timestamp itself as the watermark?
What can the out-of-orderness bound achieve that allowed lateness cannot, and in what scenario?
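To make the comparison concrete, here is a minimal plain-Java sketch that replays the events from the example against the 12:00 - 12:10 window. It is not Flink code: the class WatermarkSketch and the method replay are hypothetical names of my own, and the model rests on simplifying assumptions (the watermark advances after every event as max timestamp minus the bound, the window first fires once the watermark reaches 12:10, an element is dropped once the watermark reaches 12:10 plus the allowed lateness, and a final watermark is flushed at end of input).

```java
class WatermarkSketch {
    // Events from the example, in arrival order; timestamps are minutes since midnight.
    static final String[] IDS = {"A", "B", "C", "D", "E", "F", "G"};
    static final int[] TIMES = {721, 724, 722, 728, 734, 736, 729};

    static final int WINDOW_START = 720; // 12:00 (inclusive)
    static final int WINDOW_END = 730;   // 12:10 (exclusive)

    /**
     * Replays the events for the 12:00 - 12:10 window only.
     * boundMin    = out-of-orderness bound in minutes (assumed: watermark = max timestamp - bound)
     * latenessMin = allowed lateness in minutes
     */
    static String replay(int boundMin, int latenessMin) {
        StringBuilder onTime = new StringBuilder();
        StringBuilder late = new StringBuilder();
        StringBuilder dropped = new StringBuilder();
        long watermark = Long.MIN_VALUE;
        boolean fired = false;
        int firings = 0;

        for (int i = 0; i < TIMES.length; i++) {
            int ts = TIMES[i];
            if (ts >= WINDOW_START && ts < WINDOW_END) {
                if (watermark >= WINDOW_END + latenessMin) {
                    dropped.append(IDS[i]);   // window state already discarded
                } else if (watermark >= WINDOW_END) {
                    late.append(IDS[i]);      // window already fired: element causes a re-firing
                    firings++;
                } else {
                    onTime.append(IDS[i]);    // window not yet fired: element is simply buffered
                }
            }
            // Simplification: advance the watermark after every event.
            watermark = Math.max(watermark, (long) ts - boundMin);
            if (!fired && watermark >= WINDOW_END) {
                fired = true;
                if (onTime.length() > 0) {
                    firings++;                // first (on-time) firing of the window
                }
            }
        }
        // Assumed end-of-input flush: a final watermark fires any window still open.
        if (!fired && onTime.length() > 0) {
            firings++;
        }
        return "onTime=" + onTime + " late=" + late
                + " dropped=" + dropped + " firings=" + firings;
    }

    public static void main(String[] args) {
        System.out.println("bound=2, lateness=5: " + replay(2, 5));
        System.out.println("bound=2, lateness=0: " + replay(2, 0));
        System.out.println("bound=0, lateness=5: " + replay(0, 5));
        System.out.println("bound=7, lateness=0: " + replay(7, 0));
    }
}
```

If this simplified model is right, replay(2, 5) accepts G only as a late element and fires the window a second time, replay(0, 5) drops G because the watermark has already reached 12:16, and replay(7, 0) keeps G on time with a single firing. That suggests the bound delays the first result while allowed lateness produces repeated results, which is the distinction I would like confirmed.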