Flink-statefun dynamic function discovery and fan-out execution

Question

What would be a scalable way to dynamically register and call remote StateFun functions? I know I can register functions when submitting a Flink job, but submitting a new build for every new function is not ideal. I also wonder why Flink needs to know about remote functions at job start.

If I used the StateFun URL path template as the HTTP endpoint, would it be possible to dynamically discover remote functions under a namespace?


spec:
  functions: com.example/*
  urlPathTemplate: https://bar.foo.com/{function.name}

Here function.name would be a dynamically generated UUID. I don't yet understand how this would work, though.
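To make the idea concrete, here is a minimal sketch of how I assume the template would be resolved: any function addressed as com.example/<name> maps to a URL by substituting <name> for {function.name}. This is plain Java written only for illustration, not StateFun's actual code; the class UrlTemplateSketch and its resolve method are hypothetical names.

```java
import java.util.UUID;

// Hypothetical illustration only: models the substitution I assume happens,
// it is NOT StateFun's implementation.
final class UrlTemplateSketch {
    // Values copied from the spec above.
    static final String NAMESPACE = "com.example";
    static final String TEMPLATE = "https://bar.foo.com/{function.name}";

    /**
     * Resolves a "namespace/name" type name to an endpoint URL.
     * Returns null when the namespace does not match the wildcard spec.
     */
    static String resolve(String typeName) {
        int slash = typeName.lastIndexOf('/');
        if (slash < 0) {
            return null; // not of the form namespace/name
        }
        String namespace = typeName.substring(0, slash);
        String name = typeName.substring(slash + 1);
        if (!namespace.equals(NAMESPACE) || name.isEmpty()) {
            return null;
        }
        return TEMPLATE.replace("{function.name}", name);
    }

    public static void main(String[] args) {
        // A function name generated at deploy time, as described above.
        String name = UUID.randomUUID().toString();
        System.out.println(resolve(NAMESPACE + "/" + name));
    }
}
```

If this is roughly what happens, a newly deployed function would only need to be reachable at its URL and would never have to be listed individually in the job.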

Alternatively, we might be able to leverage broadcast state (assuming remote functions can be invoked from a KeyedBroadcastProcessFunction). Say we maintain a map of functions in external storage, e.g. S3.

The second approach:

Create a KeyedBroadcastProcessFunction that reads the current state of the function map when the function is opened, in open(..)
Send an SNS notification when a new function is deployed
In processBroadcastElement, read the newly added S3 file referenced by the SNS notification and update the broadcast state registered under the broadcast state descriptor
All operator instances will then share the same broadcast function map
The KeyedBroadcastProcessFunction will send each new message received in processElement to all functions in the broadcast function map
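The steps above can be sketched roughly as follows. This is a plain-Java model of the logic only, with no Flink dependency: the method names mirror the KeyedBroadcastProcessFunction callbacks, but the class BroadcastFanOutSketch, the in-memory map standing in for broadcast state, and the "url <- message" output format are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of the fan-out logic; not real Flink code.
final class BroadcastFanOutSketch {
    // Stand-in for the broadcast state: function name -> endpoint URL.
    // A TreeMap keeps iteration order deterministic for the example.
    private final Map<String, String> functionMap = new TreeMap<>();

    // Mirrors open(..): seed the map from a snapshot (e.g. read from S3).
    void open(Map<String, String> snapshot) {
        functionMap.putAll(snapshot);
    }

    // Mirrors processBroadcastElement: a deploy notification adds or
    // replaces one function entry.
    void processBroadcastElement(String functionName, String url) {
        functionMap.put(functionName, url);
    }

    // Mirrors processElement: emit one invocation per known function.
    List<String> processElement(String message) {
        List<String> invocations = new ArrayList<>();
        for (Map.Entry<String, String> entry : functionMap.entrySet()) {
            invocations.add(entry.getValue() + " <- " + message);
        }
        return invocations;
    }

    public static void main(String[] args) {
        BroadcastFanOutSketch sketch = new BroadcastFanOutSketch();
        sketch.open(Map.of("a", "https://bar.foo.com/a"));
        sketch.processBroadcastElement("b", "https://bar.foo.com/b");
        sketch.processElement("hello").forEach(System.out::println);
    }
}
```

Messages that arrive before a function's broadcast update are only sent to the functions known at that moment, which is one of the trade-offs I would like to understand better.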

A third, and possibly the simplest, approach would be to register a processing-time timer and fetch the updated function map from S3 in the onTimer handler every 5 minutes.
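A rough sketch of the timer-based idea, again as a plain-Java model rather than real Flink code. The loader stands in for the S3 read, advanceTo stands in for processing time moving forward, and the names PeriodicRefreshSketch, advanceTo, and functions are hypothetical.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of the periodic refresh; not real Flink code.
final class PeriodicRefreshSketch {
    static final long REFRESH_INTERVAL_MS = 5 * 60 * 1000L; // 5 minutes

    private final Supplier<Map<String, String>> loader; // stand-in for the S3 fetch
    private Map<String, String> functionMap;
    private long nextTimerMs;

    PeriodicRefreshSketch(Supplier<Map<String, String>> loader, long nowMs) {
        this.loader = loader;
        this.functionMap = loader.get();               // initial load
        this.nextTimerMs = nowMs + REFRESH_INTERVAL_MS; // register first timer
    }

    // Mirrors onTimer: reload the map and register the next timer.
    void onTimer(long timestampMs) {
        functionMap = loader.get();
        nextTimerMs = timestampMs + REFRESH_INTERVAL_MS;
    }

    // Test driver: advance processing time and fire every timer that is due.
    void advanceTo(long nowMs) {
        while (nextTimerMs <= nowMs) {
            onTimer(nextTimerMs);
        }
    }

    Map<String, String> functions() {
        return functionMap;
    }

    public static void main(String[] args) {
        PeriodicRefreshSketch sketch = new PeriodicRefreshSketch(
                () -> Map.of("a", "https://bar.foo.com/a"), 0L);
        sketch.advanceTo(REFRESH_INTERVAL_MS);
        System.out.println(sketch.functions());
    }
}
```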

Which would be the preferred option? Any pointers on a trade-off analysis between these approaches (apart from the time lag in discovering newly added functions in the third approach)?
