-2

We want to remind users to complete their workflow. These workflow events look like 'Workflow started', 'progressed stage 1', 'progressed stage 2',... 'Workflow ended' and they flow through Kafka. Each event has a unique identifier to identify a workflow attempt by the user.

How do we design a pipeline in Flink to detect workflows that have started but abandoned in the middle? Is there any established pattern for this?

siliconsenthil
  • 1,380
  • 1
  • 14
  • 25

2 Answers2

1

You can use processFunction timers I think. Timers

Kenank
  • 321
  • 1
  • 10
  • We ended up following the approach similar to https://stackoverflow.com/a/47071833/227705. For ensuring replay-ability we had to consider this as well. https://medium.com/bird-engineering/replayable-process-functions-in-flink-time-ordering-and-timers-28007a0210e1 – siliconsenthil Dec 01 '22 at 10:46
0

We ended up building with a timeout process function. We process each event of a workflow attempt and set a timer to fire.

Instant timerFireAt = event.getTimestamp().plusSeconds(timeoutDuration);

context.timerService().registerProcessingTimeTimer(timerFireAt.toEpochMilli();

This keeps getting updated with each incoming event of the same workflow attempt. On completion of the attempt, we delete the timer. If it's not deleted i.e. if there are no events for certain time, the timer fires.

siliconsenthil
  • 1,380
  • 1
  • 14
  • 25